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Abstract 

In this Chapter we review our works on force fields for molecular simulations of 
protein systems. We first discuss the functional forms of the force fields and present 
some extensions of the conventional ones. We then present various methods for 
force-field parameter optimizations. Finally, some examples of our applications of 
these parameter optimization methods are given and they are compared with the 
results from the existing force-fields. 
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1 Introduction 

Computer simulations of protein folding into native structures can be achieved 
when both of the following two requirements are met: (1) potential energy func- 
tions (or, force fields) for the protein systems are sufficiently accurate and (2) suffi- 
ciently powerful conformational sampling methods are available. Professor Harold 
A. Scheraga has been one of the most important pioneers in studies of both of the 
above requirements lUIll . By the developments of the generalized-ensemble algo- 
rithms (for reviews, see, e.g., Refs. ||3]|4l|5]|U) and related methods. Requirement (2) 
seems to be almost fulfilled. In this Chapter, we therefore concentrate our attention 
on Requirement (1). 

There are several well-known all-atom (or united-atom) force fields, such as AM- 
BER GllllllllOlin], CHARMM miiniEl, OPLS ESlIIll, GROMOS ifTTlfTSll, 
GROMACS JTHIlOl, and ECEPP l2l][22l. Generally, the force-field parameters are 
determined based on experimental results for small molecules and theoretical results 
using quantum chemistry calculations of small peptides such as alanine dipeptide. 

However, the simulations using different force-field parameters will give differ- 
ent results. We have performed detailed comparisons of three version of AMBER 
(ff94 0, ff96 la, and ff99 g)), CHARMM yjj, OPLS-AA/L 116|, and GROMOS 
IfTTll by generalized-ensemble simulations of two small peptides in explicit solvent. 
||23ll24l We saw that these force fields showed clearly different behaviors especially 
with respect to secondary-structure-forming tendencies. The folding simulations of 
the two peptides with implicit solvent model also showed similar results Il25ll26ll27l . 
For instance, the ff94 Q and ff96 IHl versions of AMBER yield very different 
behaviors about the secondary-structure-forming tendencies, although these force 
fields differ only in the main-chain torsion-energy terms. Many researchers have 
thus studied the main-chain torsion-energy terms and their force-field parameters. 
For example, newer force-field parameters for the main-chain torsion-energy terms 
about (j) and \j/ angles have been developed, which are, e.g., AMBER ff99SB ifTOl . 
AMBER ff03 im, CHARMM22/CMAP mill and OPLS-AA/L [161. The meth- 
ods of the force-field optimization thus mainly concentrate on the torsion-energy 
terms. These modifications of the torsion energy are usually based on quantum 
chemistry calculations |l28l|29l[30l[l3l[l4l[3ll or NMR experimental results II321I33II . 

We have proposed a new main-chain torsion-energy term, which is represented 
by a double Fourier series in two variables, the main-chain dihedral angles and 
y/ ll34l [35l . This expression gives a natural representation of the torsion energy in 
the Ramachandran space 1361 in the sense that any two-dimensional energy surface 
periodic in both and y/ can be expanded by the double Fourier series. We can then 
easily control secondary-structure-forming tendencies by modifying the main -chain 
torsion-energy surface. We have presented preliminary results for AMBER ff94 and 
AMBER ff96 Il34ll35l. 

Moreover, we have introduced several optimization methods of force-field pa- 
rameters Il25ll26ll27l[37l[38l . These methods are based on the minimization of some 
score functions by simulations in the force-field parameter space, where the score 
functions are derived from the protein coordinate data in the Protein Data Bank 
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(PDB). One of the score functions consists of the sum of the square of the force 
acting on each atom in the proteins with the structures from the PDB ||25l |26l |27l . 
Other score functions are taken from the root-mean-square deviations between the 
original PDB structures and the corresponding minimized structures ll37l[38l . 

We have also proposed a new type of the main-chain torsion-energy terms for 
protein systems, which can have amino-acid-dependent force-field parameters ll39l . 
As an example of this formulation, we applied this approach to the AMBER ff03 
force field and determined new amino-acid-dependent main-chain torsion-energy 
parameters for y/ (N-Ca-C-N) and y/' (C|3-Ca-C-N) by using our optimization 
method in Refs l25l l26l l27l . 

In this Chapter, we review our works on protein force fields. In section 2 the 
details of the new main-chain torsion-energy terms and the methods for refinements 
of force-field parameters are given. In section 3 examples of the applications of 
these methods are presented. Section 4 is devoted to conclusions. 



2 Methods 

2.1 General force field for protein systems 

The all-atom force fields for protein systems such as AMBER, CHARMM, OPLS, 
and ECEPP use essentially the same functional forms for the potential energy except 
for minor differences. The commonly used total conformational potential energy 
ficonf is given by 

^conf ^BL ^ A ^ ^torsion ^" ^nonbond ; ( 1 ) 

where 

£bl= l m-^^^f. (2) 

bond length (i 

^BA = 52 ^eiO— Oeq)^ , (3) 
bond angle 6 

^torsion = T + COs{n<P - Yn)] , (4) 

dihedral angle <l> " 

^nonbond 

Here, Zsbl, ^^ba. and istorsion represent the bond-stretching term, the bond-bending 
term, and the torsion-energy term, respectively. The bond-stretching and bond- 
bending energies are given by harmonic terms with the force constants, K( and Kg, 
and the equilibrium positions, £eq and 0eq- The torsion energy is, on the other hand, 
described by the Fourier series in Eq. (|5]l, where the sum is taken over all dihedral 



j1 

J2 



Bij 332qiqj 



'J I J 



en. 



(5) 
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angles 0, n is the number of waves, 7„ is the phase, and V„ is the Fourier coefficient. 
The nonbonded energy in Eq. (|5]l is represented by the Lennard-Jones and Coulomb 
terms between pairs of atoms, ; and j, separated by the distance r,y (in A). The pa- 
rameters Aij and Bij in Eq. (|5]l are the coefficients for the Lennard-Jones term, qi 
(in units of electronic charges) is the partial charge of the /-th atom, and e is the di- 
electric constant, where we usually set e = 1 (the value in vacuum). The factor 332 
in the electrostatic term is a constant to express energy in units of kcal/mol. Hence, 
we have five classes of force-field parameters, namely, those in the bond-stretching 
term (/Q and £eq), those in the bond-bending term {Kq and 0eq), those in the torsion 
term (Vn and 7„), those in the Lennard-Jones term (A,y and Bij), and those in the 
electrostatic term {qi). 

Eq. ([TJ represents a standard set of the potential energy terms. As mentioned 
above, there are minor differences in the energy functions among different force 
fields. For instance, the Urey-Bradley term is used in CHARMM and OPLS, but 
not in AMBER. In our parameter refinement methods, we try to optimize a certain 
set of parameters in the existing force fields without changing the functional forms. 
Therefore, if the original force field has non-standard terms, then the optimized one 
also has them. 



2.2 New torsion-energy terms 

2.2.1 Representation by a double Fourier series ll34ll35l 

Separating the contributions £'(0, of the backbone dihedral angles and \if from 
the rest of the torsion terms firest, we can write the torsion energy term in Eq. (|5]l as 

^torsion =E{(^,\\f)+ Brest , (6) 

where we have 

£(0 , r) = L T + " + ^ T + cos(« r - Yn)] ■ (7) 

m n 

For example, the coefficients for the cases of six force fields namely, AMBER 
parm94, AMBER parm96, AMBER parm99, CHARMM27, OPLS-AA, and OPLS- 
AA/L, are summarized in Table[Tl and we can expHcitly write £(0,1//) in Eq. (|7]) as 
follows: 

£'parm94(0,r) = 2.7 - O.2cos20 - 0.75 COS 1// - L35 cos2 1/ - 0.4cos4v/ (8) 
fiparmgeC^,^) = 2.3 + 0.85 COS0 - 0.3 cos20 + 0.85 COS 1/ - 0.3 cos2v/ , (9) 
^parmggl^,^) = 5.35 + 0.8 COS - 0.85 cos20 - L7 cosy/ -2.0 cos 2)/ j(10) 
£'charmm(0,'/) = 0.8 -O.2COS0 +0.6COS V/ , (11) 
£'oPLS-AA(0,r) = 1.158 - 1.1825cos0-O.456cos20-O.425cos30 
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+ 0.908cosv/-0.611cos2v/ + 0.7905cos3vA, (12) 
£oPLS-AA/L(^,r) = 0.81885 -O.298cos^-O.1395cos20 -2.4565 cos3(/) 

+ 0.3715COSV/- 1.254cos2v/-0.4025cos3v/ . (13) 



Table 1 Torsion-energy parameters for tfie backbone dihedral angles ^ and i// for AMBER 
parm94, AMBER parm96, AMBER parm99, CHARMM27, OPLS-AA, and OPLS-AA/L in 
Eq. 0. 



(j) Y 




m 


v„, 


(kcal/mol) y,„ 


(radians) 


n 


Vn 
? 


(kcal/mol) y,, (radians) 


parm94 


2 




0.2 


71 


1 




0.75 


7t 












2 




1.35 


7t 












4 




0.4 


71 


parm96 


1 




0.85 





1 




0.85 







2 




0.3 


71 


2 




0.3 


71 


parm99 


1 




0.8 





1 




1.7 


71 




2 




0.85 


71 


2 




2.0 


71 


charmm 


1 




0.2 


71 


1 




0.6 





opls-aa 


1 




-1.1825 





1 




0.908 







2 




0.456 


7t 


2 




0.611 


K 




3 




-0.425 





3 




0.7905 





opls-aal 


1 




-0.298 





1 




0.3715 







2 




0.1395 


71 


2 




1.254 


7t 




3 




-2.4565 





3 




-0.4025 






The backbone torsion-energy term E{(j),\i/) in Eq. (|7]i is a sum of two one- 
dimensional Fourier series: one is for (j) and the other for \//. The two variables 
^ and 1/ are decoupled, and no correlation of (p and i// can be incorporated. On 
the other hand, any periodic function of (p and y/ with period 2n can be expanded 
by a double Fourier series. As a simple generalization of £(0, i//), we therefore pro- 
posed to express this backbone torsion energy by the following double Fourier series 

= fl + ^ COSOT0 +c,„sinm0) 

771=1 

+ ^ (ii„cos«v/ + e„sin«v/) 

71=1 

+ ^ ^ (/„,„cosm(/) cosnv/ + g,„„cosm^ sinny/ 

771= 1 77=1 

+ /tm77 sinOT0 cosni^-|- im,j sinm0 sinny/) . (14) 
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Here, m and n are the numbers of waves, a, b„„ c„„ c/,,, e„, /„,„, g„,„, /i,„„, and /„,„ 
are the Fourier coefficients. This equation includes cross terms in and while 
the original term in Eq. (|7]i has no mixing of (p and xj/. Therefore, our new torsion- 
energy term can represent more complex energy surface than the conventional ones. 
The Fourier coefficients, by definition, are given by 

1 f 

c=— d(j) dxj/ <S'{^,Y)x{^t¥) 

CC J—n J—n 

/ TT \ 2 1 /-ISO _ /-ISO ( n . n \ / n ~ n \ 

- (iso) «/.o'^/iso''^<T^^'I^'^)-<T^^'I^'^) ' '''' 

where a are the normalization constants and x(^, i/) are the basis functions for the 
Fourier series. Table |2] summarizes these coefficients and functions. Here, ^ and 
i/A are given in radians, and ^ and ij/ are in degrees = -ffg^, = ifo Here- 
after, angular quantities without tilde and with tilde are in radians and in degrees, 
respectively. 

Table 2 Fourier coefficients c, noiTnalization constants a, and the basis functions for the 

double Fourier series of the backbone torsion energy ((^1, V/) in Eqs. (M) and {BJ. 



c 


a 






a 






1 


h 






cosin(j) 








sin m(p 


dn 


2n^ 




cosni// 


en 


27t^ 




sin n\f/ 


fmn 




cosmlj. 


icosny/ 


Snm 


n- 


cosm< 


p sinnY 


hfnn 




sin m<l 


) cos n ij/ 






sin mi 


p sinni// 



Finally, (#'(0 , y/) in Eq. ( fT4b and Erest in Eq. ^ define our torsion-energy term in 
Eq. ([B (instead of Eq. Q): 

^torsion = +£'rest ■ (16) 

The double Fourier series in Eq. (fl4] | is particularly useful, because it describes 
the backbone torsion-energy surface in the Ramachandran space. The Fourier series 
can express the torsion-energy surface i/) that was obtained by any method 
including quantum chemistry calculations lfT6lE8ll29ll30l[T3l[T4l[3T1 . 

Moreover, one can refine the existing backbone torsion-energy term and con- 
trol the secondary-structure-forming tendencies of the force fields. For example, a- 
helix is obtained for (0, y/) w (-57°, -47°), 3io-helix for (0,1//) w (-49°, -26°), 
TT-helix for (0,1//) « (-57°, -70°), parallel j3-sheet for (0,i//) « (-119°, 113°), 
antiparallel j3-sheet for (0,1//) w (-139°, 135°), and so on ||3ll. Hence, if the ex- 
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isting force field gives, say, too little a -helix-forming tendency compared to exper- 
imental results, one can lower the backbone torsion-energy surface near ij/) = 
(—57°,— 47°) in order to enhance a-helix formations. 
We can thus write 

<^{0,W)^E{(j),w)-f{$,w), (17) 

where E{(^,\if) is the existing backbone torsion-energy term that we want to refine 
and /{([), xj/) is a function that has peaks around the corresponding regions where 
specific secondary structures are to be enhanced. There are many possible choices 
for f{(j),\)f). For instance, one can use the following function when one wants to 
lower the torsion-energy surface in a single region near (0, i/) = (00,^0): 

[ , otherwise , 

(18) 

where A, B, and ro are constants that we adjust for refinement. In this case, the 
energy surface is lowered by f{(j),y/) in a circular region of radius ro, which is 
centered at (0, V'^) = (0o, Vb)- Note that we should also impose periodic boundary 
conditions on /(0, i/a). 

We then express S'{(p,y/) in Eq. ( fTTb in terms of the double Fourier series in 
Eq. (fT4b . where the Fourier coefficients are obtained from Eq. ( fTSl l. Hence, we can 
fine-tune the backbone torsion-energy term by the above procedure so that it yields 
correct secondary-structure-forming tendencies. 

Some remark about the computation time is now in order It may appear that we 
have to expect great increase in computation time by the introduction of the double 
Fourier series, because the number of terms are much larger. However, because most 
of the computation time for the force-field evaluations is spent in the calculations 
of distances between pairs of atoms in the system, the increase in computation time 
due to the double Fourier series is essentially negligible compared to these main 
computational efforts. 



2.2.2 Amino-acid-dependent main-chain torsion-energy terms ||39ll 

By writing the dihedral-angle dependence of the parameters explicitly, we can 
rewrite the torsion-energy term in Eq. ^ as 

where the first summation is taken over all dihedral angles (both in the main 
chain and in the side chains), n is the number of waves, y„ is the phase, and V„ is 
the Fourier coefficient. Namely, the energy term /itorsion has 7„(^') and V„(^') as 
force-field parameters. 
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We can further write the torsion-energy term as 

p _ p(MC) p(SC) 

^torsion — ^torsion ~^ ^torsion ' 

where E^^^^^ and ^t^fj-^io^ are the torsion-energy terms for dihedral angles around 
main-chain bonds and around side-chain bonds, respectively. Examples of the di- 
hedral angles in E^^^n ^re </> (C-N-C„-C), y/ (N-Ca-C-N), 0' (C^-C„-N-C), 

(sc) 

(Cj3-Ca-C-N), and CO (Ca-C-N-Ca)- The force-field parameters in can read- 

ily depend on amino-acid residues. However, those in ^to^sfjj, are usually taken to 
be independent of amino-acid residues and the common parameter values are used 
for all the amino-acid residues (except for proline). This is because the amino-acid 
dependence of the force field is believed to be taken care of by the very existence of 
side chains. In Table|3] we list examples of the parameter values for i// (N-Ca-C-N) 
and \jf' (C|3-Cc(-C-N) in general AMBER force fields. 



Table 3 Torsion-energy parameters (V„ and y,,) for the main-chain dihedral angles \f/ and \f/' in 
Eq. {H) for the original AMBER ff94, ff96, ff99, ff99SB, and ff03 force fields. The values are 
common among the amino-acid residues for each force field. Only the parameters for non-zero V„ 
are listed. 



force field 




V/(N-Ca-C-N) 






V/' (Cp-C„-C-N) 






n 


V„/2 


r« 


n 


V„I2 


r» 


ff94 


1 


0.75 


7t 


2 


0.07 







2 


1.35 


K 


4 


0.10 







4 


0.40 


71 








ff96 


1 


0.85 





2 


0.07 







2 


0.30 


71 


4 


0.10 





ff99 


1 


1.70 


71 


2 


0.07 







2 


2.00 


7t 


4 


0.10 





ff99SB 


1 


0.45 


7t 


1 


0.20 







2 


1.58 


7t 


2 


0.20 







3 


0.55 


7t 


3 


0.40 





ff03 


1 


0.6839 


7t 


1 


0.7784 


Tt 




2 


1.4537 


71 


2 


0.0657 


71 




3 


0.4615 


7t 


3 


0.0560 






However, this amino-acid independence of the main-chain torsion-energy terms 
is not an absolute requirement, because we are representing the entire force field 
by rather a small number of classical-mechanical terms. In order to reproduce the 
exact quantum-mechanical contributions, one can introduce amino-acid dependence 
on any force-field term including the main-chain torsion-energy terms. Hence, we 
can generalize E^^l^ in Eq. (|20] | from the expression in Eq. ( fT9] l to the following 
amino-acid-dependent form: 
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20 

LEE 



{l+cos[«<^-r„(<y]} , (21) 



MC 



(k) 

where ^ 1 , 2, • • • , 20) is the label for the 20 kinds of amino-acid residues and ^y^Q 
are dihedral angles around the main-chain bonds in the A;-th amino-acid residue. 



2.3 Optimization of force-field parameters 

2.3.1 Use offeree acting on each atom with the PDB coordinates 
ll2ll|26i|27j|40J 

In the previous subsection, we presented functional forms of the force fields. Given 
a fixed set of force-field functions, we try to optimize a certain set of parameters 
in the force fields without changing the functional forms. Therefore, if the original 
force field has non-standard terms, then the optimized one also has them. 

Our optimization method for these force-field parameters is now described ll25l . 
We first retrieve native structures (one structure per protein) from PDB. We try to 
choose proteins from different folds (such as all a-helix, all j3 -sheet, a/j3, etc.) and 
different homology classes as much as possible. If the force-field parameters are of 
ideal values, then all the chosen native structures are stable without any force acting 
on each atom in the molecules on the average. Hence, we expect 



where 



and 



F = , (22) 



iV 2 N,„ 



E IT E If' J , (23) 

m=l ■'^'n !,„ = 1 



{m} 



(24) 



Here, A',,, is the total number of atoms in molecule m, E^^l^ is the total potential 
energy for molecule m, x, is the Cartesian coordinate vector of atom ;, and f, is the 
force acting on atom /. In reality, F ^Q, and because F > 0, we can optimize the 
force-field parameters by minimizing F with respect to these parameters. In practice, 
we perform a simulation in the force-field parameter space for this minimization. 

Proteins are usually in aqueous solution, and hence we also have to incorporate 
some kind of solvent effects. Because the more the total number of proteins (A^) 
is, the better the force-field parameter optimizations are expected to be, we want to 
minimize our efforts in the calculations of the solvent effects. Here, we employ the 
generalized-Born/surface area (GB/SA) terms for the solvent contributions 14111421 . 
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Hence, we use in Eq. ( l24l i (we suppress the label m for each molecule) 

Eiol = £^conf + ^solv : (25) 

where 

^soiv = ^'gb +Es,k 1 (26) 




k 



Namely, in the GB/SA model, the total solvation free energy in Eq. (|26b is given by 
the sum of a solute-solvent electrostatic polarization term, a solvent-solvent cavity 
term, and a solute-solvent van der Waals term. A solute-solvent electrostatic polar- 
ization term can be calculated by the generalized Born equation in Eq. (l27T i. where 
Uij = y/aiOj, a, is the so-called Born radius of atom Djj = rfj/{2aij)^, and Cj 
is the dielectric constant of bulk water (we take £.« = 78.3). A solvent-solvent cav- 
ity term and a solute-solvent van der Waals term can be approximated by the term 
in Eq. ( |28] ) that is proportional to the solvent accessible surface area. Here, A^. is 
the total solvent-accessible surface area of atoms of type k and ff<. is an empirically 
determined proportionality constant 14111421 . 

The flowchart of our method for the optimization of force-field parameters is 
shown in Fig.[T| 

In Step 1 of the flowchart we try to obtain as many structures as possible from 
PDB. The number is limited by the computer power that we have available in our 
laboratory. We want to choose proteins with different sizes (numbers of amino 
acids), different folds, and different homology classes as much as possible. We also 
want to use only those with high experimental resolutions. Note that only atomic co- 
ordinates of proteins are extracted from PDB (and coordinates from other molecules 
such as crystal water are neglected). 

If we use data from X-ray experiments, hydrogen atoms are missing, and thus 
in Step 2 we have to add hydrogen coordinates. Many protein simulation software 
packages provide with routines that add hydrogen atoms to the PDB coordinates, 
and one can use one of such routines. 

We now have protein coordinates ready, but usually such "raw data" result 
in very high total potential energy and strong forces will be acting on some of the 
atoms in the molecules. This is because the hydrogen coordinates that we added 
as above are not based on experimental results and have rather large uncertainties. 
The coordinates of heavy atoms from PDB also have experimental errors. We take 
the position that we leave the coordinates of heavy atoms as they are in PDB as 
much as possible, and adjust the hydrogen coordinates to reduce this mismatch. This 
is why we want to include as many PDB data as possible with high experimental 
resolutions (so that the effects of experimental errors in PDB may be minimal). We 
thus minimize the total potential energy ^tot = ^conf + £^soiv + ^constr with respect to 
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1. Seliieve ^native structures ( ooe structure per ]ircitieiii)ficKmPDB 



2. Add hydiDgen atoms if not ai'ailable in FDB 



3. Kefine each itiucture in 1. by iiiiiiimiziiig the total potential energy 
{with, tbe optimized force-field paiiamjeters if already optimized) 
uith respect to theii" coordiaatei with piedefined cooitraints. on coordinates. 



4. Optimize the fiist set of foice-field parametEis by minimiziiig F in Eq. (23) ^calculated 
fiom tlie refined structuieE obraiued ia 3.) with respect to these first iet of panaiueteri. 



5. Rjefine each stmctiHe in 2. hy miiumiztcg the toital potential eoHgy 
(with the optimized force-field paranietei i) mth netpect to their coordioates 
with predefined conitiaitili on cooiilinates 



6. Optimize the second set of force-field pammeters hy mmnni-rina W in Fg (23) (calculated 
from the I'efined stiuctuies obtained in 5.J with respect to these second set of parameters 




No 



Nevi' fbrce-fiield parameters 



Fig. 1 The flowchart of our method for the optimization of force-field parameters. 
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the coordinates for each protein conformation, where £constr is the constraint energy 
term that is imposed on the heavy atoms in PDB (it is referred to as the "predefined 
constraints" in Steps 3 and 5 in Fig.[T]i: 

£constr= ^x(x-Xo)2. (29) 

heavy atom 

Here, Kx is the force constant of the restriction, and xq are the original coordinate 
vectors of heavy atoms in PDB. Because we are searching for the nearest local- 
minimum states, usual minimization routines such as the conjugate-gradient method 
and Newton-Raphson method can be employed here. As one can see from Eq. (|29l ), 
the coordinates of hydrogen atoms will be mainly adjusted, but unnatural heavy- 
atom coordinates will also be modified. We perform this minimization for all 
protein structures separately, and obtain refined structures. 

Given set of "ideal" reference coordinates in Step 3 of the flowchart, we now 
optimize the first set of force-field parameters in Step 4. In Eq. ([T]) we have five 
classes of force-field parameters as mentioned above. Namely, the force-field pa- 
rameters are those in the bond-stretching term (K/i and £eq), those in the bond- 
bending term (Kg and 0eq), those in the torsion term (V„ and y,,), those in the 
Lennard- Jones term (A/y and and those in the electrostatic term {q,). Because 
they are of very different nature, we believe that it is better to optimize these classes 
of force-field parameters separately (as in Steps 4, 6, and so on in Fig. [Til. Note also 
that if we optimize all the parameters simultaneously, the null result (with all the 
parameter values equal to zero) is a solution to Eq. (l22l i. This is the main reason 
why we optimize each class of parameters separately. 

For each set of force-field parameters, the optimization is carried out by min- 
imizing F in Eq. i23[ with respect to these parameters. Here, istot in Eq. jTM 
is given by Eq. i25[ . For this purpose usual minimization routines such as the 
conjugate-gradient method are not adequate, because we need a global optimiza- 
tion. One should employ more powerful methods such as simulated annealing 1431 
and generalized-ensemble algorithms I?)- We perform this minimization simulation 
in the above parameter space to obtain the parameter values that give the global 
minimum of F. 

These processes are repeated until the optimized force-field parameters converge. 
We can, in principle, optimize all the force-field parameters following the flowchart 
in Fig.[T| In the examples given below, however, we just optimize two classes of the 
force-field parameters for simplicity; namely, the partial charges and the backbone 
torsion-energy parameters. For the optimization of the partial charges (qf), we im- 
pose a condition that the total charge of each amino acid remains constant, which 
is the usual assumption adopted by the force fields of Eq. ([T]i based on classical 
mechanics. As for the main chain torsion-energy parameters, we use the following 
functional form for each backbone dihedral angle and y/ (see Eq. (|5]l): 

£<f.=^.,V = y + cos(«„4> - Ya)] + Y [1 + cos(nh0 - Yh)] + y [1 + cos{n,0 - Yc)] ■ 

(30) 
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We optimize only the parameters (y„, Vt,, and Vc) and fix the number of waves («„, 
nb, and n^) and the phases 7^,, and jc) as in the original force field. This torsion- 
energy parameter optimization strongly depends on the values of the force constant 
Kx of the constraint energy in Eq. ( |29l i: The larger the values of are, the larger 
those of Va, Vh, and Vc tend to be. In order to minimize such dependences, we impose 
the constraint that the total area enclosed by the curve of l^^l (from <P = —180° to 
180°) remains less than or equal to the original value during the optimization. 

We believe that these two classes of parameters have the most uncertainty among 
all the force-field parameters. This is because partial charges are usually obtained 
by quantum chemistry calculations of an isolated amino acid in vacuum separately, 
which is a very different condition from that in amino acids of proteins in aqueous 
solution, and because the torsion-energy term is the most problematic (for instance, 
the parm94, parm96, and parm99 versions of AMBER differ mainly in backbone 
torsion-energy parameters). 

Moreover, when we perform the optimizations of force-field parameters by using 
F in Eq. i23[ . we can neglect unnaturally large forces acting on atoms in order to 
remove the errors of PDB structures. Namely, we can exclude the term for f,,„ in 
Eq. ( |23] | that satisfies 

|f/J>/cut. (31) 
We determine the cutoff value /cut by using the following function: 



^RMSD = y i 5^ (^r''™ - (32) 

Here, n is the total number of backbone dihedral angles {(j) and 1// angles) in all 
molecules, 0^"^*= is the i-th backbone dihedral angle of the native structures and 
0mm jjjg corresponding ;-th backbone dihedral angle of the minimized structures 
using the trial force-field parameters. The optimal value of /cut is chosen so that 
0RMSD is the minimal value with /cut < /c™^", where /^f is obtained in an appro- 
priate way (see an example below). 



2.3.2 UseofRMSDlOSl 

We now describe another second method for optimizing the force-field parameters. 
We use proteins again from PDB, which can be the same proteins as those that 
we used in the previous optimization method. If the force-field parameters are of 
ideal values, we expect that all the chosen native structures minimized by the ideal 
force field do not change after minimizations. Namely, we believe that force-field 
parameters are better, if they have smaller deviations obtained by minimizations of 
protein structures. Hence, we expect 

R^O, (33) 



where 
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N 



Y^RMSDi 



i=l 



(34) 



Here, RMSDi is the root-mean-square deviation between the native structure of pro- 
tein / and the corresponding minimized structure using the trial force-field param- 
eters. In reality, ^ 7^ 0, and because /? > 0, we expect that we can optimize the 
force-field parameters by minimizing R with respect to these force-field parameters. 
In practice, we perform a simulation in the force-field parameter space for this min- 
imization. Namely, in the previous method we minimize F in Eq. ( l23T l. and in the 
present method we minimize R in Eq. ( l34l i instead. 

2.3.3 Useof RMSDIIOtII 

We now describe our third method for optimizing the force-field parameters. We 
first select proteins from PDB as in the previous two methods. If the force-field 
parameters are of ideal values, we expect that all the chosen native structures min- 
imized by the ideal force field do not change. Namely, we believe that force-field 
parameters are better, if they have lower deviations obtained from minimizations of 
protein structures. Hence, we expect 



Here, n is the total number of backbone dihedral angles (0 and \^ angles) in 
all molecules, ^>"^"^'' is the ;-th backbone dihedral angle of the native structures 
and 0™" is the corresponding /-th backbone dihedral angle of the minimized 
structures using the trial force-field parameters. In reality, ^>RMSD 7^ 0, because 
0RMSD > 0, we expect that we can optimize the force-field parameters by min- 
imizing 4>RMSD with respect to these force-field parameters. In practice, we per- 
form a simulation in the force-field parameter space for this minimization. 

However, our first aim is to determine the balance of secondary-structure-forming 
tendencies such as helix structure and )3 -sheet structure. Additionally, it is difficult 
to perform the minimization of ^>RMSD in wider force-field paramter space until 
0RMSD is close to because of the computational cost. Therefore, we only fo- 
cus on secondary-structure regions of helix structure and j3 -sheet structure in the 
amino-acid sequence. Namely, we only consider the backbone dihedral angles of 
residues in the native structures which are identiffied by the DSSP program 1441 
that they constitute one of a-helix, 3/10-helix, TT-helix, and j3-sheet structures. We 
calculate two kinds of 0RMSD for secondary structures, namely, ^>RMSDheiix and 
0RMSD^. Here, ^RMSDheiix stands for 0RMSD of backborn dihedral angles of 
residues which have helix structures in the native structures, and 0RMSD|3 means 



0RMSD = 0, 



(35) 



where 




(36) 
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that of only j3 -sheet structures in the native structures. Using these two 4>RMSDs, 
we want to optimize the torsion-energy parameters, which will have better balance 
of secondary-structure-forming tendencies. We propose the following combination: 



where we have introduced a fixed scaling factor A . 

Finally, by minimizing ^>RMSD2ndiy with respect to the force-field parameters, 
we can obtain the optimized force-field parameters. 

2.3.4 Use of short MD simulations HU 

We now describe our fourth method for optimizing the force-field parameters. In 
this method, we prepare M protein structures, which are some experimentally deter- 
mined conformations. For these proteins, we perform MD simulations, which start 
from the experimental conformations, by using a trial force field. We try to perform 
MD simulations with varied values of force-field parameters. After that, we estimate 
the "5"' value defined by the following function for the trajectories of the M proteins 
obtained from the trial MD simulations: 



Here, nf is the number of the amino acids in protein / where their structures in 
PDB (initial conformation) had some secondary structures (such as a-helix, 3io- 
helix, TT-helix, and j3 structures) but transformed into unstructured, coil structures 
without any secondary structures after a short MD simulation. Likewise, n^^^ is is 
the number of amino acids in protein ; where their structures in PDB had coil struc- 
tures but transformed to have some secondary structures after a MD simulation. A^^ 
is the total number of amino acids in protein / which have some secondary struc- 
tures in PDB, and A^^ is the total number of amino acids in protein / which have coil 
structures in PDB. 

When we calculate the S values for the conformations obtained from MD sim- 
ulations by using trial force-field parameters, the parameter set, which yields the 
minimum S value, is considered to give the optimized force field. 



0RMSD2ndiy = A<5RMSDhciix + <?RMSD/j, 



(37) 




(38) 
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3 Examples of Optimizations of Force-Field Parameters 

3.1 New torsion-energy terms 

3.1.1 Representation by a double Fourier series ll34ll35l 

We now present various examples of our refinements of force-field parameters. We 
first consider the following truncated double Fourier series (see Eq. ( fT4b : 

= a + b\ cos(j) +ci sin^ +b2Cos2(p +C2sin20 +biCos3(p +C3sin30 
+ di cos \// + ei sin 1/ + cos 2 1//' + 62 sin2\// + ii3Cos3v/^ + e3 sinSy/' 
+ /ii cos 0COS + cos sin + 11 sin (p cosy/ + in sin sin y/ 
+ /21 cos2(/)cos v/ + g2i COS 20 sin i// + /i2i sin20cos + /21 sin 20 sin y/ 
+ /i2cos0cos2v/ + gi2cos0 sin2i// + /!i2sin0cos2v/ + /i2sin0 sin2v/ 
+ fii cos2(/)cos2i// + ^22cos20sin2v/' 

+ /i22sin20cos2v/ + ;22sin20sin2v/ . (39) 

This function has 29 Fourier-coefficient parameters. We will see below that this 
number of Fourier terms is sufficient for most of our purposes. 

We first check how well the truncated Fourier series in Eq. (l39T l can reproduce the 
six original backbone torsion-energy terms in Eqs. dSt-dlJl). Because these functions 
are already the sum of one-dimensional Fourier series and subsets of the double 
Fourier series in Eq. (fT4b . the Fourier coefficients in Eq. ( fTSl l can be analytically 
calculated and agree with those in Eqs. (l8Tl-(fT3Tl except for the last one (that for 
cos4\//) in Eq. This term is missing in Eq. ( [39] l. These cases thus give us good 
test of numerical integrations in Eq. ( fTSl ). The numerical integrations were evaluated 
as follows. We divided the Ramachandran space (-180° < < 180°, -180° < 
iff < 180°) into unit square cells of side length e (in degrees). Hence, there are 
(360/e)^ unit cells altogether The double integral on the right-hand side of Eq. ( fTsT i 
was approximated by the sum of [S' {-^^,^yf)x[-^^,-^yf)] x (e) , where 
each S' ( ^ , ^ yf) x ( ^ ^,-^yf) was evaluated at one of the four corners of each 
unit cell. We tried two values of e (1° and 10°). Both cases gave almost complete 
agreement of Fourier coefficients with the resutls of the analytical integrations (see, 
for example. Tables |4]below). 

In Fig. |2] we compare the six original backbone torsion-energy surfaces with 
those of the corresponding double Fourier series in Eq. ( [39] l. Hereafter, the primed 
labels for figures such as (a') indicate that the results are those of the double Fourier 
series. As can be seen from Fig. |2] the backbone torsion-energy surfaces are in 
complete agreement for all force fields except for AMBER parm94, whereas we see 
a little difference for AMBER parm94 between Figs. |2ja) and|2a'). As discussed 
above, this slight difference for AMBER parm94 reflects the fact that the cos4v/ 
term in Eq. ([8]) is missing in the truncated double Fourier series in Eq. ( |39] |. 
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Table 4 Fourier coefficients in Eq. ii9) obtained from the numerical evaluations of the integrals in 
Eq. {15). "org94" stands for the original AMBER parm94 force field. "mod94(a)" and "mod94(/3)" 
stand for AMBER parm94 force fields that were modified to enhance a-helix structures and J3- 
sheet stiTJCtures, respectively, by Eqs. {TT) and HSt . The bin size e is the length of the sides of each 
unit square cell for the nuinerical integration in Eq. {Tsj. 



bin size e 




1° 






10° 




coefficient 


org94 


mod94(a) 


mod94(/3) 


org94 


mod94(a) 


mod94(/3) 


a 


2.700000 


2.308359 


1.916719 


2.700000 


2.308370 


1.916742 


hi 


0.000000 


-0.330937 


0.781150 


0.000000 


-0.331053 


0.781041 


Cl 


0.000000 


0.509599 


0.930938 


0.000000 


0.509517 


0.930809 


bi 


-0.200000 


-0.101549 


-0.115937 


-0.200000 


-0.101513 


-0.115970 


Cl 


0.000000 


0.221123 


-0.476745 


0.000000 


0.221100 


-0.476558 


bs 


0.000000 


-0.018073 


0.031693 


0.000000 


-0.018084 


0.031714 


C3 


0.000000 


-0.002862 


-0.018298 


0.000000 


-0.003036 


-0.018310 


di 


-0.750000 


-1.164401 


-0.052959 


-0.750000 


-1.164500 


-0.052874 


ei 


0.000000 


0.444390 


-0.995478 


0.000000 


0.444289 


-0.995599 


d2 


-1.350000 


-1.333115 


-1.184428 


-1.350000 


-1.333073 


-1.184340 


ei 


0.000000 


0.241460 


0.454905 


0.000000 


0.241451 


0.455147 


d3 


0.000000 


-0.014220 


0.035349 


0.000000 


-0.014143 


0.035324 


es 


0.000000 


-0.011515 


0.009472 


0.000000 


-0.011671 


0.009465 


/ll 


0.000000 


-0.342789 


-0.680493 


0.000000 


-0.343087 


-0.680497 


8n 


0.000000 


0.367596 


0.971845 


0.000000 


0.367697 


0.971851 


hn 


0.000000 


0.527849 


-0.810980 


0.000000 


0.527949 


-0.810985 


ill 


0.000000 


-0.566049 


1.158199 


0.000000 


-0.565751 


1.158206 


fzi 


0.000000 


0.090016 


-0.064642 


0.000000 


0.090168 


-0.064636 


g2\ 


0.000000 


-0.096530 


0.092318 


0.000000 


-0.096472 


0.092309 




0.000000 


0.202178 


0.366601 


0.000000 


0.202421 


0.366565 


h\ 


0.000000 


-0.216810 


-0.523561 


0.000000 


-0.216596 


-0.523509 


hi 


0.000000 


0.012329 


-0.142682 


0.000000 


0.012385 


-0.142712 


812 


0.000000 


0.176308 


-0.392017 


0.000000 


0.176622 


-0.392098 


hi2 


0.000000 


-0.018984 


-0.170042 


0.000000 


-0.019013 


-0.170077 


hi 


0.000000 


-0.271490 


-0.467187 


0.000000 


-0.271321 


-0.467284 


f22 


0.000000 


-0.000586 


-0.002453 


-0.000001 


-0.000585 


-0.002451 


g22 


0.000000 


-0.008378 


-0.006738 


0.000000 


-0.008397 


-0.006733 


h22 


0.000000 


-0.001316 


0.013909 


0.000000 


-0.001317 


0.013897 


ino 


0.000000 


-0.018817 


0.038215 


0.000000 


-0.018867 


0.038183 



We now consider the double Fourier series of non-trigonometric functions. The 
functions are those in Eqs. ( fTTb and (fTSl l. We try to fine-tune the six original force 
fields by subtracting /((/), i/) in Eq. ( fTSl l from the original functions. The criterion 
for fine-tuning is, for instance, whether the refined force fields yield better agree- 
ment of the secondary-structure-forming tendencies with experimental implications 
than the original ones. For this we need good experimental data. Because the pur- 
pose here is to test whether or not we can control the secondary-structure-forming 
tendencies, we simply consider extreme cases where we try to modify the existing 
force fields so that desired secondary structures may be obtained regardless of the 
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Fig. 2 Backbone-torsion-energy surfaces of six force fields. The backbone dihedral angles ^ and 
iff are in degrees, (a), (b), (c), (d), (e), and (f) are those of the original AMBER parm94, the original 
AMBER parm96, the original AMBER parm99, the original CHARMM 27, the original OPLS- 
AA, and the original OPLS-AA/L, respectively, (a') to (f) are those of (a) to (f), respectively, that 
were expressed by the truncated double Fourier series in Eq. )39t . The contour lines are drawn 
every 0.5 kcal/mol. 



tendencies of the original force fields. Note that the six original force fields have 
quite different preferences for a-helix and j3-sheet structures Il23ll24ll25ll26ll27l . 

The function 1//) inEq. (fTSll reduces the value of E{(j),\if) inacircle of radius 
ro with the center located at (^o, Wo)- We used ro = 100° and B = 5,000 (degrees)^. 
The coefficient A is calculated by Eq. ( fTSl l from the other parameters /(0o, Wo), r^, 
and B. Namely, we have 

A=/(^o,ro)exp (^^^ . (40) 

We used (^o,V>o) = (-57°, -47°) and (0o,W)) = (-130°, 125°) in order to en- 
hance a -helix-forming tendency and j3 -sheet-forming tendency, respectively. The 
central values /(^OiVA)) that we used were 3.0 kcal/mol and 6.0 kcal/mol for en- 
hancing a-hehx and j3 -sheet, respectively, in the case of AMBER parm94, AMBER 
parm99, CHARMM27, and OPLS-AA/L. They were both 3.0 kcal/mol in the case 
of AMBER parm96 and OPLS-AA. 
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We remark that the large value of f{^, i/zb), 6.0 kcal/mol, that was necessary to 
enhance j3 -sheet in the case of AMBER parm94, AMBER parm99, CHARMM27, 
and OPLS-AA/L reflects the fact that their original force fields favor a-helix. 

In Fig.|3lal)-(fl) we compare the six backbone torsion-energy surfaces modified 
according to Eq. ( fTTI i. which reduced the torsion energy in the a-helix region, with 
those of the corresponding double Fourier series in Eq. (|39] |. In Fig.[3lal)-(fl), a- 
helix is enhanced from the original AMBER parm94 (al), AMBER parm96 (bl), 
AMBER parm99 (cl), CHARMM27 (dl), OPLS-AA (el), and OPLS-AA/L (fl). 
In Fig.|4jal)-(fl) we show the case of the j3-sheet region, and j3-sheet is enhanced 
from the original AMBER parm94 (al), AMBER parm96 (bl), AMBER parm99 
(cl), CHARMM27 (dl), OPLS-AA (el), and OPLS-AA/L (fl). 




Ill , r^I ijl 111 

m tCti iSii 



Fig. 3 Backbone-torsion-energy surfaces of six force fields that were modified by Eqs. {TTJ, ([18] 
and 02)). From (al) to (fl) are those of AMBER parm94, AMBER parm96, AMBER parm99, 
CHARMM 27, OPLS-AA, and OPLS-AA/L force fields that were modified to enhance a-helix 
stmctures, respectively. From (al') to (fl') are those of AMBER parm94, AMBER parm96, AM- 
BER parm99, CHARMM 27, OPLS-AA, and OPLS-AA/L force fields that were expanded by the 
truncated double Fourier series in Eq. )39t . 



These modified backbone torsion-energy functions were expanded by the trun- 
cated double Fourier series in Eq. ( |39] l by evaluating the corresponding Fourier co- 
efficients from Eq. ( fTSb . For the numerical integration we again tried two values 
of the bin size £ (1° and 10°). The obtained Fourier coefficients are summarized 
in Tables |4] for example, in the case of AMBER parm94. For comparisons, the 
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Fig. 4 Backbone-torsion-energy surfaces of six force fields that were modified by Eqs. (TtJ, jl8| 
and OD)- From (al) to (fl) are those of AMBER parm94, AMBER parm96, AMBER parm99, 
CHARMM 27, OPLS-AA, and OPLS-AA/L force fields that were modified to enhance /3 -sheet 
stmctures, respectively. From (al') to (fl') are those of AMBER parm94, AMBER parm96, AM- 
BER parm99, CHARMM 27, OPLS-AA, and OPLS-AA/L force fields that were expanded by the 
truncated double Fourier series in Eq. )39t . 



Fourier coefficients of the original AMBER force fields (before modifications) are 
also listed. We see that the two choices of the bin size e gave essentially the same 
results (agreeing in about 3 digits). 

In Figs.[3jar)-(fr) and|4jar)-(fr) we show the backbone torsion-energy sur- 
faces represented by the truncated double Fourier series. Comparing these with the 
original ones in Fig. [3al)-(fl) and|4jal)-(fl), we find that the overall features of 
the energy surfaces are well reproduced by the Fourier series. If more accuracy is 
desired, we can simply increase the number of Fourier terms in the expansion. As 
we will see below, the present accuracy of the Fourier series was sufficient for the 
purpose of controlling the secondary-structure-forming tendencies towards a-heUx 
or /3 -sheet. 

We examined the effects of the above modifications of the backbone torsion- 
energy terms in AMBER parm94, AMBER parm96, AMBER parm99, CH ARMM27, 
OPLS-AA, and OPLS-AA/L (towards specific secondary structures) by perform- 
ing the folding simulations of two peptides, namely, C-peptide of ribonuclease A 
and the C-terminal fragment of the B 1 domain of streptococcal protein G, which 
is sometimes referred to as G-peptide 1461 . The C-peptide has 13 residues and its 
amino-acid sequence is Lys-Glu-Thr-Ala-Ala-Ala-Lys-Phe-Glu-Arg-Gln-His-Met. 
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This peptide has been extensively studied by experiments and is known to form an 
a-helix structure ll47ll48l . as shown in Fig.|5la). Because the charges at peptide ter- 
mini are known to affect helix stability ll47ll48ll . we blocked the termini by a neutral 
COCH3- group and a neutral -NH2 group. The G-peptide has 16 residues and its 
amino-acid sequence is Gly-Glu-Trp-Thr-Tyr-Asp-Asp-Ala-Thr-Lys-Thr-Phe-Thr- 
Val-Thr-Glu. The termini were kept as the usual zwitter ionic states, following the 
experimental conditions B6l l49l l50l . This peptide is known to form a /3 -hairpin 
structure by experiments B6ll49ll50l . as shown in Fig.jSjb). 




Simulated annealing (43l MD simulations were performed for both peptides from 
fully extended initial conformations, where the 12 versions of the truncated double 
Fourier series (which were described in Tableland in Fig.[3al')-(fl') and|4jal')- 
(fl')) were used for the backbone torsion-energy terms of AMBER parm94, AM- 
BER parm96, AMBER parm99, CHARMM27, OPLS-AA, and OPLS-AA/L force 
fields. For comparisons, the simulations with the original force fields were also per- 
formed. The unit time step was set to 1.0 fs. Each simulation was carried out for 1 
ns (hence, it consisted of 1,000,000 MD steps). The temperature during MD simula- 
tions was controlled by Berendsen's method 152]. For each run the temperature was 
decreased exponentially from 2,000 K to 250 K. We modified and used the program 
package TINKER version 4.1 lf53l for all the simulations. As for solvent effects, 
we used the GB/S A model BTl l42l included in the TINKER program package. For 
both peptides, these folding simulations were repeated 60 times with different sets 
of randomly generated initial velocities. 

In Fig. |6] we show seven (out of 60) lowest-energy final conformations of C- 
peptide and G-peptide obtained by the simulated annealing MD simulations, for 
example, in the case of AMBER parm94. 

In the Figure, we see that all conformations of the original AMBER parm94 
(except for conformations 2 and 4 of G-peptide) and all conformations of its force 
field modified towards a-helix are a-helix structures (conformations 2 and 4 are 



22 



Yoshitake Sakae and Yuko Okamoto 




Fig. 6 Seven lowest-energy final conformations of C-peptide (a)-(a") and G-peptide (b)-(b") ob- 
tained from six sets of 60 simulated annealing MD runs, (a) and (b) are the results of the original 
AMBER parm94. (a') and (b') are the results of AMBER parm94 of the truncated double Fourier 
series of six force fields that were modified to enhance a-helix stmctures. (a") and (b") are the 
results of AMBER parm94 of the truncated double Fourier series of six force fields that were 
modified to enhance /3 -sheet structures. The conformations are ordered in the increasing order of 
energy for each case. The figures were created with DS Visualizer vl.5 l54l . 
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3io-helix structures). The results show that the original AMBER parm94 favors a- 
heUx structures, and moreover, its force field modified towards a-helix favors 0£- 
helix structures more than the original force field in the sense that the obtained 
helices are more extended (and almost entirely helical). On the other hand, AMBER 
parm94 modified towards j3 -sheet favors )3 structures strongly. The results for other 
force fields were similar. 

Therefore, regardless of the secondary-structure-forming tendencies of the orig- 
inal force fields, our modifications of the backbone torsion-energy term succeeded 
in enhancing the desired secondary structures. 

3.1.2 Amino-acid-dependent main-chain torsion-energy terms ll39l 

We present the results of our optimizations of the force-field parameters V\{^^q) 
for the main-chain angles ^^^^ = v/f*^) (N-Ca-C-N) and i/''*^' (Cp-Ca-C-N) in 
Eq. dlB- We did this for the case of AMBER ff03 force field. We determined these 
^i(^Mc) values for the 19 amino-acid residues except for proline. 

At first, we chose 100 PDB files with resolution 2.0 A or better, with sequence 
similarity of amino acid 30.0 % or lower, and with less than 200 residues (the aver- 
age number of residues is 117.0) from PDB -REPRDB EH (see Tableland Fig.]?]). 
We then refined these selected 100 structures. Generally, data from X-ray experi- 
ments do not have coordinates for hydrogen atoms. Therefore, we have to add hydro- 
gen coordinates. Many protein simulation software packages provide with routines 
that add hydrogen atoms to the PDB coordinates. We used the AMBERl 1 program 
package ||56) . We thus minimized the total potential energy fitotai = £^conf + ^soiv + 
^constr with respect to the coordinates for each proten conformation, where /iconstr 
is the harmonic constraint energy term (-fi'constr — ^heavy atom 

Kxin-x^Y), and Esoiv 

is the solvation energy term. Here, Kx is the force constant of the restriction and 
xo are the original coordinate vectors of heavy atoms in PDB. As one can see from 
£^constr, the coordinates of hydrogen atoms will be mainly adjusted, but unnatural 
heavy-atom coordinates will also be modified. We performed this minimization for 
all the 100 protein structures separately and obtained 100 refined structures by us- 
ing Kx = 100 (kcal/mol). As for the solvation energy term E^oXv, we used the GB/SA 
solvent included in the AMBER program package {igb ~ 5 and gbsa = 1) ||57ll58l . 

For these refined protein structures, we performed the optimization of force-field 
parameters Vj of i/a and \\r angles for AMBER ff03 force field by using the fucn- 
tion F in Eq. ( |23] | as the total potential energy function (istotai = ^conf +£soiv) for the 
Monte Carlo simulations in the parameter space. Here, we used AMBERl 1 ll56l for 
the force calculations in Eq. (l24l l. We have to optimize the 38 (= 2 x 19) parameters 
simultaneously by the simulations in 38 parameters. However, here, for simplic- 
ity, we just optimized two parameters, Vi (i/''^') and Vi (i//'^*'), for each amino-acid 
residue k separately, keeping the other Vi values as the original values. In order to 
obtain the optimal parameters, we performed Monte Carlo simulations of two pa- 
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Table 5 100 proteins used in the optimization of force-field parameters. 



fold 


PDB ID 


chain 


PDB ID 


chain 


PDB ID 


chain 


PDB ID 


chain 


all a 


IDLW 


A 


INIJ 


B 


1U84 


A 


IHBK 


A 




1TX4 


A 


1V54 


E 


1SK7 


A 


ITQG 


A 




1V74 


B 


IDVO 


A 


IHFE 


S 


IJOP 


A 




1Y02 


A71-114 


IIJY 


A 


1I2T 


A 


1G8E 


A 




IVKE 


C 


IFSl 


AI09-I49 


1D9C 


A 


lAIL 


A 




1Q5Z 


A 


1T8K 


A 


IOR7 


C 


1NG6 


A 




1C75 


A 


2LIS 


A 


INH2 


B 


1Q2H 


A 




INKP 


A 














all (8 


IXAK 


A 


1T2W 


A 


IGMU 


CI -70 


lAYO 


A 




1PK6 


A 


INLQ 


C 


IBEH 


A 


1UA8 


A 




lUXZ 


A 


1UB4 


C 


ILGP 


A 


ICQY 


A 




1PM4 


A 


10U8 


A 


1V76 


A 


1UT7 


B 




10A8 


D 


IIFG 


A 










a/j8 


IIOO 


A 


1U7P 


A 


IJKE 


C 


IMXI 


A 




ILYl 


A 


INRZ 


A 


1IM5 


A 


IVCl 


A 




lOGD 


A 


IIIB 


A 


IPYO 


D 


IMUG 


A 




1H75 


A 


1K66 


A 


ICOZ 


A 


ID40 


A 


a + ji 


IVCC 


A 


IPPO 


B 


IPZ4 


A 


ITUl 


A 




1Q2Y 


A 


1M4J 


A 


1N9L 


A 


ILQV 


B 




1A3A 


A 


1K2E 


A 


1TT8 


A 


IHUF 


A 




ISXR 


A 


ICYO 


A 


IKAF 


A 


UDO 


A 




lUCD 


A 


1F46 


B 


IKPF 


A 


IBYR 


A 




1Y60 


D 


ISEI 


A 


IRL6 


A 


1WM3 


A 




IFTH 


A 


lAPY 


B 


IJID 


A 


1N13 


E 




ILTS 


C 


IJYO 


F 


IE87 


A 


lUGI 


A 




IMWP 


A 


IPCF 


A 


IMBY 


A 


IIHR 


B 



1H6H A 



rameters (Vi of y^f and for the 19 amino-acid residues except for proline. In Table 
|6] the optimized parameters are listed. 

In order to check the force-field parameters obtained by our optimization method, 
we performed the folding simulations using two peptides, namely, C-peptide and G- 
peptide. 

For the folding simulations, we used replica-exchange molecular dynamics (REMD) 
||59ll . REMD is one of the generalized-ensemble algorithms, and has high confor- 
mational sampling efficiency by allowing configurations to heat up and cool down 
while maintaining proper Boltzmann distributions. We used the AMBERl 1 program 
package ll56l . The unit time step was set to 2.0 fs, and the bonds involving hydrogen 
atoms were constrained by SHAKE algorithm f60l. Each simulation was carried out 
for 30.0 ns (hence, it consisted of 15,000,000 MD steps) with 16 replicas by using 
Langevin dynamics. The exchange procedure for each replica were performed every 
3,000 MD steps. The temperature was distributed exponentially: 650, 612, 577, 544, 
512, 483, 455, 428, 404, 380, 358, 338, 318, 300, 282, and 266 K. As for solvent 
effects, we used the GB/SA model in the AMBER program package {igb = 5 and 
gbsa = 1) 15711581 . The initial conformations for each peptide were fully extended 
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ones for all the replicas. The REMD simulations were performed with different sets 
of randomly generated initial velocities for each replica. 

In Fig. [8] a-helicity and j3-strandness of the two peptides obtained from the 
REMD simulations are shown. We checked the secondary-structure formations by 
using the DSSP program 1441 . which is based on the formations of the intra-main- 
chain hydrogen bonds. As is shown in Fig. [8j for the original AMBER ff03 force 
field, the a-helicity is clearly higher than the j3-strandness not only in C-peptide 
but also in G-peptide. Namely, the original AMBER ff03 force field clearly favors 
0£-helix and does not favor /3 -structure. On the other hand, for the optimized force 
field, in the case of C-peptide, the a-helicity is higher than the j3-strandness, and in 
the case of G-peptide, the j3-strandness is higher than the a-helicity. We conclude 
that these results obtained from the optimized force field are in better agreement 
with the experimental results in comparison with the original force field. In Fig.|9l 
3io-helicity and 7r-helicity of two peptides obtained from the REMD simulations 
are shown. For 3 lo helicity, there is no large difference for both force fields in C- 
peptide, and in the case of G-peptide, the value of the optimized force field slightly 
decreases in comparison with the original force field. 7r-helicity has almost no value 
in the both cases of the original and optimized force fields in two peptides. 
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Table 6 Optimized Vi /2 parameters for tlie main-chain dihedral angles and iff' for the 19 amino- 
acid residues (except for proUne) in Eq. i2l\ . The rest of the parameters are taken to be the same 
as in the original ff03 force field. The original amino-acid-independent values are also listed for 
reference. 





V (N-Ca-C-JN) 


V (<-(3-Cc(-C-N) 


original ff03 


0.6839 


0.7784 


Ala 


0.122 


0.150 


Arg 


0.409 


0.200 


Asn 


-0.074 


-0.162 


Asp 


-0.137 


0.182 


Cys 


0.361 


0.089 


Gin 


0.144 


-0.024 


Glu 


0.180 


0.152 


Gly 


0.258 




His 


0.020 


0.237 


De 


0.643 


0.194 


Leu 


0.382 


0.257 


Lys 


0.222 


0.042 


Met 


0.141 


0.346 


Phe 


-0.010 


0.553 


Set 


-0.248 


0.475 


Thr 


0.512 


0.328 


Tip 


0.027 


0.477 


Tyr 


0.082 


0.652 


Val 


0.142 


0.590 



In Fig.[TOl a-helicity and )3-strandness as functions of temperature for the two 
peptides obtained from the REMD simulations are shown. For a-helicity, the values 
of both force fields decrease gradually from low temperature to high temperature 
in the case of C-peptide. On the other hand, in the case of G-peptide, there are 
small peaks at around 300 K and 358 K for the original and optimized force fields, 
respectively. For /3-strandness, in the case of C-peptide, it is almost zero for both 
force fields. In the case of G-peptide, for the optimized force field, there is clearly a 
peak around 300 K. 



3.2 Optimization of force-field parameters 

3.2.1 Use offeree acting on each atom in the PDB coordinates Il25ll26ll27ll40l 

We now present the results of our force-field optimizations. In Step 1 of the 
flowchart in Fig. [T] we chose 100 PDB files (A^ = 100) from X-ray experiments 
with resolution 1.8 A or better and with less than 200 residues (the average num- 
ber of resiudes is 120.4) from PISCES ED- Their PDB codes are 2LIS, lEPO, 
ITIF, 1EB6, ICIL, ICCW, 2PTH, 1I6W, IDBF, IKPF, ILRI, lAAP, 1C75, 1CC8, 
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Fig. 8 a-lielicity (a-1) and /3-strandness (a-2) of C-peptide and a-iielicity (b-1) and j8-strandness 
(b-2) of G-peptide as functions of the residue number at 300 K. These values were obtained from 
the REMD simulations. Normal and dotted curves stand for the optimized and the original AMBER 
ff03 force fields, respectivery. 



1FK5, IKQR, IKIE, ICZP, IGPO, IKOI, IIQZ, 3EBX, 1140, lEJG, lAMM, 1107, 
1GK8, IGVP, 1M4I, lEYV, 1E29, 1I2T, IVCC, IFMO, lEXR, IGUT, 1H4X, 
IGBS, IBOB, 119L, IIFC, IDLW, lEAJ, IGGZ, 1JR8, 1RB9, IVAP, IJZG, 1M55, 
1EN2, 1C90, 2ERL, lEMV, 1F41, 1EW6, 2TNE HER, USE, IKAE IHZT, IHQK, 
IFXL, IBKR, IIDO, ILQV, 1G2R, 1KR7, IQTN, 1D40, lEAZ, 2CY3, lUGI, IIJV, 
3VUB, IBZP, IJYR, IDZK, IQET, lUTG, 2CPG, 1I6W, 1C7K, 1180, 1L07, ILNI, 
lEQO, INDD, 1HD2, 3PYP 1FD3, 1DK8, IWHI, lEAZ, 4EGE 2MHR, 1JB3, 
2MCM, IIGD, 1C5E, and IJIG. 

In Step 2 of the flowchart, we used the routine in the TINKER package to add 
hydrogen atoms to the PDB coordinates. The force fields that we optimized are the 
AMBER parm94 version H, parm96 version fS), parm99 version Q, CHARMM 
version 22 1121 . and OPLS-AA ITSl . We have optimized only two sets of parame- 
ters. The first set is the partial-charge parameters (qj in Eqs. ^ and dZTli). In order to 
simplify the constraint-imposing processes on the total charge, we did not optimize 
the charge of one of the hydrogen atoms (HN) in proline when it is located at tht 
N-terminus. In the original X-ray data, hydrogen coordinates are missing, and in the 
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Fig. 9 3i()-helicity (a-1) and ;r-helicity (a-2) of C-peptide and 3i()-helicity (b-l) and ;r-helicity (b- 
2) of G -peptide as functions of the residue number at 300 K. These values were obtained from the 
REMD simulations. Normal and dotted curves stand for the optimized and the original AMBER 
ff03 force fields, respectivery. 



case of neutral histidine whether A^g and A^£ are protonated or not is non-trivial to de- 
termine. Because we want to deal with as many as PDB data as possible, we treated 
all the histidine residues as positively charged histidine for simplicity. Among the 
five force fields, AMBER has the largest number of remaining partial-charge pa- 
rameters (602). We thus optimized these 602 parameters for all the five force fields. 
The second set of parameters that we optimized is the backbone torsion-energy pa- 
rameters (Va, Vj,, and Vc in Eq. ( l30t ) and there are six such parameters (three each 
for and i//). 

As explained in detail above, the coodinates of the 100 proteins molecules have 
been prepared (Steps 1 and 2 of the flowchart in Fig. The coordinate refine- 
ment in Step 3 of the flowchart was then carried out with the constraint in Eq. ( [29] l 
on the heavy atoms. As for the force constant in Eq. (|29] l. we have some free- 
dom for the choice of the values. Our choice is: should be of the same order 
as Ki in the bond-stretching term in Eq. Q. The force constant Ki in AMBER 
varies from 166 kcal/mol/A^ to 656 kcal/mol/A^, and that in CHARMM varies 
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Fig. 10 a-lielicity (a-1) and /3-strandness (a-2) of C-peptide and a-helicity (b-1) and j8-strandness 
(b-2) of G-peptide as functions of temperature. These values were obtained from the REMD sim- 
ulations. Normal and dotted curves stand for the optimized and the original AMBER ff03 force 
fields, respectivery. 



from 173 kcal/mol/A^ to 650 kcal/mol/A^. Hence, in our first trial we set = 
lOOkcal/mol/A^. 

In Step 4 of the flowchart, we performed the optimization of the 602 partial- 
charge parameters by MC simulated annealing. Namely, we minimized F in Eq. ( l23T l 
by MC simulated annealing simulations of these parameters (the parameters were 
updated and the updates were accepted or rejected according to the Metropolis cri- 
terion). For this we introduced an effective "temperature" for the parameter space. 
The simulation run consisted of 50,000 MC sweeps with the temperature decreased 
exponentially from 20 to 0.01. The simulation was repeated 10 times with differ- 
ent initial random numbers. The time series of F from these simulations are shown 
in Figs. [TTTa)- [Tn e). We see that F decreases quickly in the beginning until about 
5,000 MC sweeps and then it decreases very slowly for all force fields; the total 
number of MC sweeps (50,000) seems sufficient. The optimized partial charges are 
taken from those that resulted in the lowest F value. 

In Tables |7]-[9] five examples (glycine, alanine, and glutamic acid) of the obtained 
partial charges together with the original force-field values are listed. We see from 
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Fig. 11 Time series of MC simulated annealing simulations in force-field parameter space of par- 
tial charges for AMBER parm94 (a), AMBER parm96 (b), AMBER pan-n99 (c), CHARMM ver- 
sion 22 (d), and OPLS-AA (e). The ordinate is the value of F in Eq. i23l . 



these tables that the values of the partial charges have not changed a lot. Although 
the sign of the partial charges remains the same for those with large magnitude, 
charges with small magnitude sometimes change their signs (see, for example, CA 
of glycine and CG of glutamic acid). 

In Step 5 of the flowchart, the original coordinates obtained in Step 2 were again 
refined with the constraints in Eq. (|29] |. but this time the optimized parameters from 
Step 4 were used. This time we used the value ~ 500 kcal/mol/A^. For all force 
fields, the average RMSD of the 100 proteins is 0.012 A, and the coordinates of 
heavy atoms have little changed. 

In Step 6 of the flowchart, we carried out the optimization of the six torsion- 
energy parameters (Va, V/,, and Vc in Eq. (l30l l for both (p and y/) by minimizing F 
in Eq. ( |23] ) with MC simulated annealing simulations in this parameter space. The 
simulation run consisted of 10,000 MC sweeps with the temperature decreasing 
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Table 7 Partial-charge parameters of glycine. AMB, CHA, and OPLS respectively stand for the 
original AMBER, CHARMM version 22, and OPLS-AA force fields. Opt(94), Opt(96), Opt(99), 
Opt(CH), and Opt(OP) are the optimized AMBER parm94, AMBER parm96, AMBER parm99, 
CHARMM version 22, and OPLS-AA, respectively. 



Atom 


AMB 


Opt(94) 


Opt(96) 


Opt(99) 


CHA Opt(CH) 


OPLS Opt(OP) 


N 


-0.4157 


-0.3471 


-0.3614 


-0.3506 


-0.4700 


-0.4381 


-0.5000 


-0.5153 


CA 


-0.0252 


0.0175 


0.0148 


0.0166 


-0.0200 


0.0185 


0.0800 


0.0909 


C 


0.5973 


0.5526 


0.5698 


0.5577 


0.5100 


0.5309 


0.5000 


0.6459 


HN 


0.2719 


0.2492 


0.2509 


0.2480 


0.3100 


0.3004 


0.3000 


0.2615 





-0.5679 


-0.5980 


-0.5977 


-0.5983 


-0.5100 


-0.5491 


-0.5000 


-0.5546 


HA 


0.0698 


0.0629 


0.0618 


0.0633 


0.0900 


0.0687 


0.0600 


0.0358 


Total 


0.0000 


0.0000 


0.0000 


0.0000 


0.0000 


0.0000 


0.0000 


0.0000 



Table 8 Partial-charge parameters of alanine. See the caption in Table|7] 



Atom 


AMB 


Opt(94) 


Opt(96) 


Opt(99) 


CHA Opt(CH) 


OPLS Opt(OP) 


N 


-0.4157 


-0.3354 


-0.3483 


-0.3407 


-0.4700 


-0.3909 


-0.5000 


-0.5224 


CA 


0.0337 


0.0545 


0.0547 


0.0511 


0.0700 


0.0427 


0.1400 


0.1301 


C 


0.5973 


0.5141 


0.5240 


0.5235 


0.5100 


0.5215 


0.5000 


0.6687 


HN 


0.2719 


0.2323 


0.2346 


0.2317 


0.3100 


0.2709 


0.3000 


0.2610 





-0.5679 


-0.5703 


-0.5599 


-0.5778 


-0.5100 


-0.5417 


-0.5000 


-0.5567 


HA 


0.0823 


0.0901 


0.0912 


0.0900 


0.0900 


0.0741 


0.0600 


0.0786 


CB 


-0.1825 


-0.0453 


-0.0470 


-0.0501 


-0.2700 


-0.2718 


-0.1800 


-0.0701 


HB 


0.0603 


0.0200 


0.0169 


0.0241 


0.0900 


0.0984 


0.0600 


0.0036 


Total 


0.0000 


0.0000 


0.0000 


0.0000 


0.0000 


0.0000 


0.0000 


0.0000 



Table 9 Partial-charge parameters of glutamic acid. See the caption in Table [7] 



Atom 


AMB 


Opt(94) 


Opt(96) 


Opt(99) 


CHA Opt(CH) 


OPLS Opt(OP) 


N 


-0.5163 


-0.4248 


-0.4376 


-0.4302 


-0.4700 


-0.3961 


-0.5000 


-0.5401 


CA 


0.0397 


0.0583 


0.0553 


0.0554 


0.0700 


0.0423 


0.1400 


0.1320 


C 


0.5366 


0.4728 


0.4873 


0.4817 


0.5100 


0.5249 


0.5000 


0.6538 


HN 


0.2936 


0.2595 


0.2620 


0.2590 


0.3100 


0.2845 


0.3000 


0.2626 


O 


-0.5819 


-0.6181 


-0.6107 


-0.6248 


-0.5100 


-0.5603 


-0.5000 


-0.5777 


HA 


0.1105 


0.1232 


0.1232 


0.1221 


0.0900 


0.0837 


0.0600 


0.0670 


CB 


0.0560 


0.1226 


0.1170 


0.1217 


-0.1800 


-0.1634 


-0.1200 


-0.0517 


HB 


-0.0173 


-0.0333 


-0.0334 


-0.0300 


0.0900 


0.0943 


0.0600 


0.0418 


CG 


0.0136 


-0.0678 


-0.0716 


-0.0659 


-0.2800 


-0.2870 


-0.2200 


-0.2185 


HG 


-0.0425 


-0.0300 


-0.0297 


-0.0299 


0.0900 


0.1160 


0.0600 


0.0437 


CD 


0.8054 


0.8293 


0.8340 


0.8292 


0.6200 


0.5465 


0.7000 


0.7320 


OE 


-0.8188 


-0.8142 


-0.8163 


-0.8142 


-0.7600 


-0.7479 


-0.8000 


-0.8152 


Total 


-1.0000 


-1.0000 


-1.0000 


-1.0000 


-1.0000 


-1.0000 


-1.0000 


-1.0000 
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from 1,000 to 1.0. The simulation was repeated six times with different random 
numbers. We stopped after six trials because the convergence was very good. The 
optimized torsion-energy parameters are taken from those that resulted in the lowest 
F value. The obtained torsion-energy parameters are Usted in Tables [TOl and [TTI 

Table 10 Torsion parameters of <|) angle. Parm94, Parm96, Parra99, CHARMM, and OPLS are 
AMBER parm94, AMBER parm96, AMBER parm99, CHARMM version 22, and OPLS-AA force 
fields, respectively. "Optimized" stands for the corresponding optimized force field. 



Force field 


V„ Ha 


7a 


Vb 


ni, 


Jb 








Parm94 


0.200 2 


180.0 














Optimized 


0.191 1 


0.0 


0.146 


2 


180.0 


-0.223 


3 


0.0 


Parm96 


0.850 1 


0.0 


0.300 


2 


180.0 








Optimized 


1.182 1 


0.0 


0.359 


2 


180.0 


-0.410 


3 


0.0 


Parm99 


0.800 1 


0.0 


0.850 


2 


180.0 








Optimized 


1.380 1 


0.0 


0.599 


2 


180.0 


-0.330 


3 


0.0 


CHARMM 


0.200 1 


180.0 














Optimized 


-0.047 1 


180.0 


0.240 


2 


180.0 


-0.015 


3 


0.0 


OPLS 


-2.365 1 


0.0 


0.912 


2 


180.0 


-0.850 


3 


0.0 


Optimized 


0.502 1 


0.0 


1.811 


2 


180.0 


-0.567 


3 


0.0 



Table 11 Torsion parameters of 1/ angle. See the caption in Table 1 101 



Force field 


V„ ;Ja 


r« 


Vt 


nt 


Jb 








Parm94 


0.750 1 


180.0 


1.350 


2 


180.0 


0.400 


4 


180.0 


Optimized 


-0.368 1 


180.0 


1.658 


2 


180.0 


0.265 


4 


180.0 


Parm96 


0.850 1 


0.0 


0.300 


2 


180.0 








Optimized 


0.039 1 


0.0 


1.011 


2 


180.0 


0.104 


3 


0.0 


Parm99 


1.700 1 


180.0 


2.000 


2 


180.0 








Optimized 


0.228 1 


180.0 


1.684 


2 


180.0 


-0.031 


3 


0.0 


CHARMM 


0.600 1 


0.0 














Optimized 


0.321 1 


0.0 


0.028 


2 


180.0 


0.251 


3 


0.0 


OPLS 


1.816 1 


0.0 


1.222 


2 


180.0 


1.581 


3 


0.0 


Optimized 


0.880 1 


0.0 


1.479 


2 


180.0 


0.952 


3 


0.0 



In the present work, we stopped our process in Step 6 of the flowchart and did 
not iterate the optimizations. 

In order to examine how much the torsion-energy terms have changed after op- 
timizations, we depict them in Fig. [12] (we remark that the error of factor 2 in the 
ordinate of Fig. 5 (el) in Ref. |!26l is corrected here). Although the behaviors of the 
original force fields are quite different, those of the optimized force fields are rather 
similar. For example, the optimized torsion-energy curves for ^ angles have two 
maximum peaks around (/) ~ —60° and +60° and a local minimum at (/) = 0°, while 
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those for y/ angle have two peaks around xj/ ^ —100° and +100° and a local mini- 
mum at i/A = 0° (the exceptions are those for CHARMM version 22 and OPLS-AA, 
which give the global maximum and a local maximum, respectively, at i/A = 0°). 
These results suggest that our optimizations of the torsion-energy term yield a ten- 
dency for convergence towards a common function. Some remark is in order. The 
case for the optimized CHARMM is the most distinct from other optimized param- 
eters in the sense that it gives the global maximum as = 0° whereas that for other 
cases lie around \j/ —100° and +100°. 

In Fig. [13] the potential-energy surfaces of the alanine dipeptide (ACE-ALA- 
NME) are shown for the 10 force-field parameters: the original AMBER parm94, 
AMBER parm96, AMBER parm99, CHARMM version 22, OPLS-AA, and the 
corresponding optimized parameters. According to the ab initio quantum me- 
chanical calculations, there exist three local-minimum states in the energy sur- 
face Q. They are conformers C?^^, C5, and Cj^x, which correspond to ~ 
(-80°, +80°), (-160°, +160°), and (+75°, -60°), respectively (Cy^, is the global- 
minimum state). We remark that these are the results of quantum chemistry cal- 
culations in vacuum, and so it is not clear how reliable the results are to rep- 
resent the dipeptide in aqueous solution. The results of all five original force 
fields in Figs. [T3l al)-[T3lel) seem to satisfy the above conditions. Namely, there 
are three local-minimum states at the locations of Cjeq, C5, and Cjax, and the 
global-minimum state is Cjeq- As for the results of the optimized force fields in 
Figs.[T3la2)-[T3le2), those for CHARMM version 22 and OPLS-AA also satisfy the 
above conditions. Those of the optimized AMBER force fields are less consistent 
with the quantum mechanical calculations: Cjeq is no longer the global-minimum 
state, but it is a local-minimum state. In particular, the optimized AMBER parm99 
seems to be in the greatest disagreement in the sense that the Cj^g state is almost 
disappearing. 

We now present another example of the refinement of our backbone torsion en- 
ergy in Eq. (fT4l) . We consider the following truncated Fourier series: 



This function has 13 Fourier-coefficient parameters. We will see below that this 
number of Fourier terms is sufficient for the most of our purposes ll34l[35l . but that 
for some cases more number of Fourier terms are preferred. 

We optimize the force-field parameters of this double Fourier series by using 
our optimization method. At first, we chose 100 PDB files with resolution 2.0 A or 
better, with sequence similarity of amino acid 30.0 % or lower and with less than 
200 residues (the average number of residues is 117.0) from PDB-REPRDB f55]. 
Generally, data from X-ray experiments do not have hydrogen atoms. Therefore, 
we have to add hydrogen coordinates. Many protein simulation software packages 



a + h\ cos(j) +ci sin^ +b2Cos2(j) +C2sin2^ 
+ di cosv/ + eisinv/ + (i2Cos2i// + e2sin2i// 
+ /ncos0cosi^ + §iicos0sin\// 
+ /ill sin0 cos \// + /ii sin sin I// . 



(41) 
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Fig. 12 Backbone torsion-energy curves as functions of (j) (in degrees) and \f/ (in degrees). The 
force fields are AMBER parm94 (a), AMBER parm96 (b), AMBER pan-n99 (c), CHARMM ver- 
sion 22 (d), and OPLS-AA (e). The results for the original force fields are represented by dotted 
curves, and those for the optimized force fields are by solid curves. 





Fig. 13 Potential-energy surfaces of alanine dipeptide. The force fields are the original AMBER 
parm94 (al), AMBER parm96 (bl), AMBER parm99 (cl), CHARMM version 22 (dl), and OPLS- 
AA (el), and the corresponding optimized parameters (a2)-(e2). The contour maps were evaluated 
every 10° of <j) and i// angles and plotted every 1 kcal/mol, after minimizing the total potential 
energy in vacuum with the backbone structures fixed. The bluer the color is, the lower the potential 
energy surface is. As the potential-energy value increases, the color changes from blue to green, to 
yellow, and to red. 
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provide with routines that add hydrogen atoms to the PDB coordinates. We used the 
TINKER program package 153)1 . 

In our optimization method, the minimizations of F in Eq. ( [23] l by the Monte 
Carlo (MC) simulations of the 13 backbone-torsion-energy parameters with 3000 
MC steps were performed. The initial values of 13 parameters were all set to be 
zero. We performed MC simulations of the optimization for each /cut value 10 times 
with different seeds for the random numbers. After that, the minimum F value was 
selected from the results of the obtained 10 parameter sets for each case of the /cut 
value. The overall parameter distributions were essentially the same for the 10 runs. 
The maximum /cut value was taken to be /cuf — 9.0, which was selected from the 
peak point in the distribution of the forces acting on each atom in the 100 protein 
structures in Fig. [14] For the obtained several parameters, several 0RMSD were 
calculated by using Eq. (l32l ). Here, if a difference between 0^^^"^ and 0™'" of 
a backbone dihedral angle in a protein was more than 20 degrees, the value was 
ignored. Because there are about 90% of differences between ^P""™ and ^>™'" in- 
cluding less than 20 degrees. In Fig. [15] the distribution of the backbone dihedral 
angles in the 100 protein structures is shown. Namely, we wanted to consider the 
majority of the differences of backbone dihedral angles. After the calculations of 
several 0RMSD, we select /cut = 8.5 at the minimum value of ^>RMSD from the 
several those. 
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Fig. 14 The distribution of the absolute value of the forces acting on each atom in the 1 00 protein 
structures, which were obtained from PDB. 



In Table. [12] optimized double Fourier-coefficient parameters and the corre- 
sponding original AMBER ff94 and ff96 force-field parameters are listed. Here, 
the original AMBER ff94 has a Fourier coefficient that the number of waves is four 
Therefore, this coefficient set of the original AMBER ff94 is not complete. Addi- 
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Fig. 15 Tiie distribution of the absolute value of the backbone dihedral angles <P {(j) and y/ ) in the 
100 protein structures, which were obtained from PDB. 



tionally, in Fig. [16] these backbone-torsion-energy surfaces on the Ramachandran 
space are illustrated. 



Table 12 Fourier coefficients in Eq. {391 obtained from the numerical evaluations of the integrals 
in Eq. fTSt . "org94" and "org96" stand for the original AMBER ff94 and the original AMBER 
ff96, respectively, "optimized" stands for the optimized force tield obtained by our optimization 
method. Here, the original AMBER ff94 has the Fourier coefficient that the number of waves is 
four. Therefore, this coefficient set of the original AMBER ff94 is not complete. 



coefficient 


org94 


org96 optimized 


a 


2.700 


2.300 


0.000 


h 


0.000 


0.850 


0.835 


b2 


-0.200 


-0.300 


-0.088 


C\ 


0.000 


0.000 


-0.327 


C2 


0.000 


0.000 


0.100 


di 


-0.750 


0.850 


0.287 


d2 


-1.350 


-0.300 


0.019 


e\ 


0.000 


0.000 


-0.160 




0.000 


0.000 


-0.054 


fn 


0.000 


0.000 


-0.427 


gn 


0.000 


0.000 


0.247 


/ill 


0.000 


0.000 


0.114 


'11 


0.000 


0.000 


0.603 



In order to test the validity of the force-field parameters obtained by our opti- 
mization methods, we performed folding simulations using two peptides, namely, 
C-peptide and G-peptide. 
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Fig. 16 The backbone-torsion-energy surfaces of the optimized force field (a), the original AM- 
BER ff94 (b), and the original AMBER ff96 are shown. 



For the folding simulations, we used the replica-exchange molecular dynamics 
(REMD) method 1591. We used the TINKER program package f53\ modified by us 
for the folding simulations. The unit time step was set to 1 .0 fs. Each simulation was 
carried out for 5.0 ns (hence, it consisted of 5,000,000 MD steps) with 32 replicas. 
The temperature during MD simulations was controlled by Nose-Hoover method 
ll62l . For each replica the temperature was distributed exponentially from 700 K to 
250 K. As for solvent effects, we used the GB/SA model 141] |42| included in the 
TINKER program package l53l . 

We checked the secondary-structure formations, such as the helicity and the 
strandness, by using the DSSP program ll44l . which is based on the formations of 
the intra-backbone hydrogen bonds. Strandness means that there are )3 -bridge or 
extended strand in the corresponding amino acid. In Fig.[T7l the helicity and strand- 
ness of C-peptide which were obtained with the optimized force field, the original 
AMBER ff94 and ff96 are shown. In comparison with the helicity of the original 
AMBER ff94, the helicity of the optimized force field decreases and in comparison 
with that of the original AMBER ff96, that of the optimized force field increases. 
For the strandness, the original AMBER ff94 is almost zero, and both the optimized 
force field and the original AMBER ff96 have the low strandness. 

In Fig. [18] the helicity and strandness of G-peptide which were obtained with the 
optimized force field, the original AMBER ff94 and ff96 are shown. The helicity 
of the original AMBER ff94 obviously has high value the same as the case of C- 
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Fig. 17 Helicity (a) and strandness (b) of C-peptide as functions of tlie residue number Tliese 
values are obtained from the REMD 1591 simulations at 300K. Normal, dashed, and dotted lines 
stand for the optimized force field, the original AMBER ff94, and the original AMBER ff96, 
respectively. There is only one secondary structural element (an a-helix in residues 4 to 12) in the 
native structure (PDB ID: 1A5P). See Fig.[5la). 



peptide. On the other hand, the helicity of both the optimized force field and the 
original AMBER ff96 decrease in comparison with the case of the original AMBER 
ff94. However, in comarison with the original AMBER ff96, the optimized force 
field slightly favors the helix structure in the region around amino-acid residues 
6-8. In the experimental results, there is a turn region around residues 7-10 in G- 
peptide, and the backbone-torsion angles of the turn conformation are similar to that 
of the helix structure. Therefore, we consider that this tendency is not disagreement 
with the experimental results. For the strandness, the original AMBER ff94 is also 
almost zero the same as the case of C-peptide, and both the optimized force field 
and the original AMBER ff96 have higher values of the strandness than those ot the 
helicity. In Fig. fTSl b), the strandness decreases in the region around 7-8 residues in 
agreement with the experiments. 

These secondary-structure-forming tendencies of the optimized force field for 
two peptides agree with experimental implications in comparison with those of the 
original AMBER ff94 and ff96 force field. Therefore, our improvement methods 
succeeded in enhancing the accuracy of the AMBER force field. 

3.2.2 Use ofRMSDlOSl 

We now present the results of the applications of our optimization method in Sub- 
section 2.3.2, which we refer to as Method 2, as well as that in Subsection 2.3.1, 
which we refer to as Method 1 . 

At first, we chose 100 PDB files with resolution 2.0 A or better, with sequence 
similarity of amino acid 30.0 % or lower and with less than 200 residues (the aver- 
age number of residues is 117.0) from PDB-REPRDB ll55l . Next, we refine these 
selected 100 structures. Generally, data from X-ray experiments do not have hydro- 
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Fig. 18 Helicity (a) and strandness (b) of G-peptide as functions of the residue number. These 
values are obtained from the REMD 1591 simulations at 300K. Normal, dashed, and dotted lines 
stand for the optimized force field, the original AMBER ff94, and the original AMBER ff96, 
respectively. There is only one secondary structural element (a j8 -hairpin; j8 -strands are in residues 
2 to 6 and residues 11 to 15) in the native structure (PDB ID: IPGA). See Fig.[5lb). 



gen atoms. Therefore, we have to add hydrogen coordinates. Many protein simula- 
tion software packages provide with routines that add hydrogen atoms to the PDB 
coordinates. We used the TINKER program package ||53l . We thus minimize the 
total potential energy fitotai = £^conf + ^soiv + £^constr with respect to the coordinates 
for each proten conformation, where ficonstr is the constraint energy term in Eq. ( |29b . 
Here, Kx is the force constant of the restriction and xq are the original coordinate 
vectors of heavy atoms in PDB. As one can see from Eq. (|29] |. the coordinates of 
hydrogen atoms will be mainly adjusted, but unnatural heavy-atom coordinates will 
also be modified. We performed this minimization for all the 100 protein structures 
separately and obtained 100 refined structures. 

We focused on the parameters of torsion-energy term, which we believe to be an 
important force-field term that influences the backbone conformational preferences 
such as a-helix structure and j3 -sheet structure. For example, AMBER parm94 
Q and AMBER parm96 JS) have very different behaviors about the secondary- 
structure-forming tendencies, although these force fields differ only in the backbone 
torsion-energy terms for rotations of the backbone and angles. Recently, new 
force-field parameters of the backbone torsion-energy term about and angles 
have been developed, which are, e.g., AMBER ff99SB IlOl, AMBER ff03 O, and 
CHARMM 22/CMAP ifTsl . 

The force field that we optimized is the OPLS-UA 1631 . The torsion-energy term 
£^torsion(^) for this force field is given by Eq. (|5]l. We performed the force-field 
parameter optimizations that correspond to the following torsion angles by Methods 
1 and/or 2. 



1. N-Ca-Cfi-Cy and C-Ca-Cp-Cy (^i) by Method 2 
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2. C-N-Ca-C ((j)), N-Ca-C-N (v/), C-N-Ca-C|3 and N-C-Ca-C^ by Methods 1 
and 2 

3. C-N-Ca-Cp by Method 2 

4. N-Ca-C-N by Method 2 

5. Ca-Cp-Cy-Cs (Xi of Glu) by Methods 1 and 2 

Here, we also optimized the force-field parameters of Xi of Glu. The reason is 
given below. 

In Method 1, the minimizations of F in Eq. i23[ by the Monte Carlo (MC) simu- 
lated annealing simulations of the torsion-energy parameters with 10000 MC steps 
were performed 10 times. Here, we neglected the improper- torsion-energy contribu- 
tions to /iconf in Eq. (l25l l. In order to make a better force field, we have to optimize 
many force-field parameters. However, we ignored the uncertainty of improper- 
torsion-energy parameters with this optimization, because we wanted to focus on 
the torsion-energy parameters and Method 1 is very sensitive for the energy of di- 
hedral angles. For example, one of the results of the simulations of Method 1 above 
is shown in Fig. [19] 
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Fig. 19 Time series of Monte Carlo simulated annealing simulations in force-field parameter space 
of torsion-energy for OPLS-UA. The ordinate is the value of F in Eq. i23\ . 



In Method 2, the lowest R value was selected from about 10-30 optimization 
runs with different initial conditions. In order to calculate R, the minimizations of 
100 proteins were performed using these new parameter sets. In Table [T3l all the 
optimized torsion-energy parameters are listed. As one can see in Table [13] the 
original parameters of OPLS-UA force field for the optimization are almost zero. 
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Table 13 Original and optimized torsion-energy parameters of OPLS-UA. 



Vi/2 71 V2/2 ji Vj/l ji 



org opt org opt org opt 



N-C„-C,3-C^(;i;i) 0.5 or 1.0 1.950 0.0 

C-Ca-Cfi-Cy ixx) 0.5 or 1.0 1.950 0.0 

C-N-Ca-C (<l>) 0.0 -0.662 0.0 0.0 0.277 n 0.0 -0.050 0.0 

N-Ca-C-N (V/) 0.0 0.974 0.0 0.0 0.576 n 0.0 -0.083 0.0 

C-N-C„-Cp 0.0 0.811 0.0 0.0 0.328 ;t 0.0 0.155 0.0 

N-C-Ca-C/3 0.0 0.215 0.0 0.0 0.036 n 0.0 0.015 0.0 

Cct-Cp-Cy-Cg of Glu) 0.0 0.565 0.0 0.0 0.177 ;t 2.0 -0.025 0.0 



In comparison with Method 1, Method 2 can optimize force-field parameters 
appropriately even if there are some errors in PDB structures. However, the com- 
putational cost of Method 2 is much larger than that of Method 1 . Therefore, we 
could not apply Method 2 to the global optimization in the force-field-parameter 
space. The force-field parameters of the backbone-torsion angles need the global 
optimization, because we consider that these parameters are the most problematic. 
Thus, at first, we performed the global optimization of the backbone-torsion param- 
eters by using Method 1 . After that. Method 2 was applied only on the local region 
of the parameter space, which was identified as relevant by Method 1 . 

In order to test the validity of the force-field parameters obtained by our opti- 
mization methods, we performed folding simulations using two peptides, namely, 
C-peptide and G-peptide. 

Only Glu amino acid appears twice in each of the two peptides. Therefore, we 
consider that Glu amino acid is the most important, and the Xi parameters were op- 
timized for this amino acid. (Of cource, we expect that it becomes a better force field 
if the remaining force-field parameters of other amino acids are also optimized.) 

For the folding simulations, we used the replica-exchange molecular dynamics 
(REMD) method 1591 . REMD is one of the generalized-ensemble simulation algo- 
rithms and has high conformational sampling efficiency by allowing configurations 
to heat up and cool down while maintaining proper Boltzmann distributions. We 
used the TINKER program package 1531 modified by us for the folding simulations. 
The unit time step was set to 1 .0 fs. Each simulation was carried out for 10 ns (hence, 
it consisted of 10,000,000 MD steps) with 16 replicas. The temperature during MD 
simulations was controlled by Nose-Hoover method i62;|. For each replica the tem- 
perature was distributed exponentially: 700, 662, 625, 591, 558, 528, 499, 471, 446, 
421, 398, 376, 355, 336, 317, and 300 K. As for solvent effects, we used the GB/SA 
model l4ni42l included in the TINKER program package f53l|. These folding sim- 
ulations were repeated 10 times with different sets of randomly generated initial 
velocities. 

In Fig. |20] the helicity and strandness of C-peptide which were obtained with 
the original OPLS-UA and its optimized force field are shown. These values are the 
averages of the 10 REMD simulations at 300 K. In comparison with the helicity 
of the original OPLS-UA, the helicity of the optimized force field increases at the 
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amino-acid sequence between 6 and 12. For the strandness, both the original and 
optimized OPLS-UA force fields are almost zero. 




Fig. 20 Helicity (a) and strandness (b) of C-peptide as functions of the residue number These 
values are the average of the 10 independent REMD j59 1 simulations at 300 K. Normal and dotted 
lines stand for the optimized and original OPLS-UA force fields, respectively. 



In Fig.lJT] the helicity and strandness of G-peptide at the original OPLS-UA and 
its optimized force field are shown. In comparison with the helicity of the original 
OPLS-UA, the helicity of the optimized force field decreases at the area of amino- 
acid sequence between 8 and 15, and in comparison with the strandness of the origi- 
nal OPLS-UA, the strandness of the optimized force field clearly increases at the two 
areas of amino-acid sequences 2-6 and 9-15. We checked the secondary-structure 
formations by using the DSSP program 1441 . which is based on the formations of 
the intra-backbone hydrogen bonds. Strandness means that there are )3 -bridge or 
extended strand in the corresponding amino acid. In the experimental results, there 
is a turn region around residues 7-10 and there are five intra-backbone hydrogen 
bond pairs, namely, between residue pairs 2-15, 3-14, 4-13, 5-12, and 6-1 1 in G- 
peptide. In Fig. l2Tl b). the strandness decreases in the region around 7-8 residues in 
agreement with the experiments. 

These results show that the optimized force field favors helix structures more 
than the original OPLS-UA in the case of C-peptide and favors j3 structures more 
than the original OPLS-UA in the case of G-peptide. We see that these secondary- 
structure-forming-tendencies of the optimized force field are better than those of the 
original OPLS-UA, because these results are consistent with the native structures of 
the two peptides. 

In Figs. I22]andl23l we show the 20 lowest-energy conformations of C-peptide 
and G-peptide obtained by the REMD simulations in the case of the original and 
optimized OPLS-UA force fields, respectively. In Fig. |22ta), five conformations 
(Nos. 11, 13, 16, 18, and 19) have a-helix structures for the original OPLS-UA 
in the case of C-peptide. In Fig. l22l b). 18 conformations (all conformations except 
for Nos. 2 and 12) have a-helix structures for the optimized OPLS-UA in the case 
of C-peptide. ^From these results, we can see that the optimized OPLS-UA force 



44 



Yoshitake Sakae and Yuko Okamoto 




Fig. 21 Helicity (a) and strandness (b) of G-peptide as functions of the residue number. These 
values are the average of the 10 independent REMD |591 simulations at 300 K. Normal and dotted 
lines stand for the optimized and original OPLS-UA force fields, respectively. 



field favor a -helix structure more than the original OPLS-UA force field in the case 
of C-peptide. In Fig. 1231 a). 1 1 conformations have a-helix structures for the orig- 
inal OPLS-UA in the case of G-peptide. In Fig. l23T b). seven conformations have 
a-helix structures, and eight conformations have /3 -hairpin structures for the opti- 
mized OPLS-UA in the case of G-peptide. In Fig. |23l b). two conformations (Nos. 
3 and 16) out of the eight j8 -hairpin conformations have the right hydrogen bond 
formations that are inferred by the experiments. Namely, conformation No. 3 has 
three native-like hydrogen bonds between residue pairs 3-14, 4-13, and 5-12, and 
conformation No. 16 has two native-like hydrogen bonds between residue pairs 3-14 
and 4-13. These results for G-peptide show that the optimized OPLS-UA force field 
does not favor a-helix structure and clearly favors j8 -hairpin structure more than the 
original OPLS-UA force field. 

These secondary-structure-forming tendencies of the optimized OPLS-UA force 
field for two peptides agree with experimental implications in comparison with 
those of the original OPLS-UA force field. Therefore, our optimization methods 
succeeded in enhancing the accuracy of the OPLS-UA force field. 



3.2.3 Useof RMSDIIOtI 

We now present the results of the applications of our new optimization method of 
force-field parameters. 

At first, we chose 100 PDB files with resolution 2.0 A or better, with sequence 
similarity of amino acid 30.0 % or lower, and with less than 200 residues (the aver- 
age number of residues is 122.2) from PDB-REPRDB 155)1 . We selected the number 
of each fold (all a, all j3, a/j3, and a + j3) in 100 proteins based on the number of 
folds given by SCOP (version 1.73 at November 2007) 1641 . Namely, we used 29 
all a, 18 all j3, 16 a/j3, and 37 (a + j3) proteins (the list is sHghdy different from 
that in Table IS. 
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(b) 

Fig. 22 Twenty lowest-energy conformations of C-peptide obtained from 10 sets of REMD 1591 
simulation runs, (a) and (b) are the results of the original and optimized OPLS-UA force field, 
respectively. The conformations are ordered in the increasing order of energy for each case. The 
figures were created with DS Visualizer vl.5 l51l . 



The force field that we optimized is the AMBER pann96 version IS]. The 
backbone-torsion-energy term EtorsionC^, 'f') for this force field is given by 

vf v7 v7 

^torsion f ) = ^ [1 +COS 0] + ^ [1 - C0S2^] + ^ [1 +COS V/] + ^ [1 - C0S2 VA] , 

(42) 

where we have — 1.7, Vj* = 0.6, = 1.7, and V2 — 0.6. Here, we have op- 
timized only two parameters in the backbone-torsion-energy term, namely, and 
V2 for V/ angle. As described above, AMBER parm94 and AMBER parm96 have 
quite different secondary-structure-forming-tendencies, although these force fields 
differ only in the backbone torsion-energy terms for rotations of the ^ and 1// angles. 
Moreover, we can easily imagine that force-field parameters and for 1// angle 





(b) 

Fig. 23 Twenty lowest-energy conformations of G-peptide obtained from 10 sets of REMD 1591 
simulation runs, (a) and (b) are the results of the original and optimized OPLS-UA force field, 
respectively. The conformations are ordered in the increasing order of energy for each case. The 
figures were created with DS Visualizer vl. 5 1511 . 



are important for the secondary-structure-forming-tendencies, because the energy 
surface in the Ramachandran space is quite sensitive to this energy term in the heHx 
and j3 -sheet regions. Namely, if the torsion-energy term for the xj/ angle changes, the 
stabilities of helix structure region and j3 -sheet region on the Ramachandran space 
change. Therefore, we considered some trial force-field parameters for V^'' and 
which are given by the following equations: 

y/"^' = 1.7-0.2/ = 0.34i, (43) 



yf^' =0.6-0.2/ = 0.12i. (44) 
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Here, / is any real number When ; is 5, the force-field parameters yt""' and Vj"'^' of 
y/ angle are equal to those of the original AMBER parm96. ^From our experience, if 
i has a small number (/ < 5), the force field favors helix structure, and if / has a large 
number (/ > 5), the force field favors j3 -sheet structure (see also Figs. I24landl25] 
below). We calculated 4>RMSD2ndiy values in Eq. ( |37] | about some trial force-field 
parameters obtained by changing / in Eqs. ( |43] | and (l44l i. 
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Fig. 24 Helicity (a) and strandness (b) of C-peptide as functions of tiie residue number Tliese 
values are the averages of the 10 independent REMD 1591 simulations at 300 K. Optimized, origi- 
nal, para3, and para7 stand for the optimized AMBER pann96 (/ = 4.7), original AMBER parm96 
(i = 5.0), trial force field paraS (i = 3.0), and trial force field para7 (/ = 7.0), respectively. 
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Residue number 



Fig. 25 Helicity (a) and strandness (b) of G-peptide as functions of the residue number. These 
values are the averages of the 10 REMD )59| simulations at 300 K. Optimized, original, para3, and 
para? stand for the optimized AMBER parm96 (; = 4.7), original AMBER parm96 (/ = 5.0), trial 
force field para3 (/ = 3.0), and trial force field para? (/ = 7.0), respectively. 



We performed the minimization, which was terminated when the root-mean- 
square (RMS) potential energy gradients were less than 0.1 (kcal/mol/A) by using 
TINKER program package 153)1 . For solvent effects, we used GB/SA solvent model 
in TINKER. 

The results of ^RMSDheiix and ^RMSDp are shown in Fig.lMa) and Fig.l26lb). 
recpectively. In these calculations, if the differences of the backbone-dihedral an- 
gles between cf>"^"^'' and ^>!™" in Eq. ( [36] l are more than 30 degrees, they were 
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ignored, assuming that the uncertaintties in those angles are too large. We see that 
0RMSDheiix decreases gradually with a decrease in If / decreases, the torsion 
energy of the helix structure region in the Ramachandran space also decreases. On 
the other hand, <J> RMSD|3 decreases gradually with an increase in ;. If ; increases, 
the torsion energy of the j3 structure region in the Ramachandran space decreases. 
Hence, this result is reasonable. However, c?>RMSDp reaches the global minim- 
ium, when / is 6.5. If / is larger than 6.5, ^>RMSD|3 increases gradually. This result 
implies that the ^>RMSDp does not correspond to the parameters Vj""' and V2"^' 
completely. 
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Fig. 26 Distributions of "JRMSDheiix (a), "JRMSDp (b), and <l>RMSD2ndly (c) 
obtained from tlie minimization of 100 proteins using the trial force-field parameters l/j'""' and 
yjrial depending on the number /. 
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For 0RMSDheiix and ^RMSDp in Fig.|26](a) and (b), we can see the differ- 
ence clearly. The noteworthy point obtaind from these results is that 4>RMSD can 
distinguish between helix structure and j3 structure. 

We combined ^iRMSDheiix and ^RMSD^ by Eq. (O. Here, in order to have 
roughly equal contributions from both terms, we can set the value of the scaling 
factor A to be, for example, the coefficients of variations: 



Here, /iheiix and ^ip are the averages and Ohdix and ap are the corresponding standard 
deviations for ^>RMSDheiix and ^>RMSDp . For the calculations, we have chosen a 
small number of / values in a range i'min ^ ' ^ 'max- For i'min = and 

'max — 10, we 

obtained A = 6.857, and this fixied value was used for all the calculations in the 
present work. 

In Fig. [26l c). the combined result is shown. The smallest ^'RMSD2ndiy is ob- 
tained value ; = 4.7, namely, the obtained force-field parameters are Vj'"^' = 1.598 
and Vj""' = 0.564. These values are slightly smaller than those of the original AM- 
BER parm96, which corresponds to / = 5. We can easily expect the new obtained 
force-field parameters slightly favor helix structure more and j3 -sheet structure less 
than the original AMBER parm96. 

In order to check the force-field parameters obtained by our optimization method, 
we performed the folding simulations using two peptides, namely, C-peptide and G- 
peptide. 

For the folding simulations, we used replica-exchange molecular dynamics (REMD) 
||59ll . We used the TINKER program package ll53l modified by us for the folding 
simulations. The unit time step was set to 1.0 fs. Each simulation was carried out 
for 2 ns (hence, it consisted of 2,000,000 MD steps) with 16 replicas and repeated 
10 times. The temperature during MD simulations was controlled by Berendsen's 
method 1521 . For each replica the temperature was distributed exponentially: 700, 
662, 625, 591, 558, 528, 499, 471, 446, 421, 398, 376, 355, 336, 317, and 300 K. As 
for solvent effects, we used the GB/SA model |l4T]|42l included in the TINKER pro- 
gram package [[53|| . These folding simulations were performed with different sets of 
randomly generated initial velocities. 

In Fig.[24l the helicity and strandness of C-peptide which were obtained with the 
original AMBER parm96 and its optimized force field are shown. These values are 
the averages of the 10 REMD simulations at 300 K. In comparison with the helicity 
of the original AMBER parm96, the helicity of the optimized force field is similler 
However, the helicity of Thr3, Ala4, and Ala5 of the optimized force field slightly 
increases. In comparison with the strandness of the original AMBER parm96, the 
strandness of the optimized force field decreases except for those at Ala6, Lys7, and 
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In Fig. |25] the helicity and strandness of G-peptide at the original AMBER 
parm96 and its optimized force field are shown. In comparison with the helicity 
of the original AMBER parm96, the helicity of the optimized force field slightly 
increases, and in comparison with the strandness of the original AMBER parm96, 
the strandness of the optimized force field slightly decreases. For trial force fields of 
para3 and para7, the scondary-structure-forming-tendencies are simillar to the case 
of C-peptide. 

These results clearly show that the optimized force field favors helix structures 
and does not favor j3 structures than the original AMBER parm96. We can see that 
these secondary-structure-forming-tendencies of the optimized force field are better 
than those of the original AMBER parm96, becasue it is known that the AMBER 
parm96 slightly favors the j8 structure too much Il23ll24ll25ll26ll27l . 

We also performed the folding simulations with two extreme cases of the trial 
force fields, namely, para3 (; = 3.0) and para7 (; = 7.0) (see Figs. l24l and IZSTi for 
comparisons. The trial force field para3 favors helix structure strongly and does 
not favors )3 structure clearly. On the other hand, the trial force field para7 has the 
tendency that is quite reverse to para3. According to the results of cf>RMSDheiix 
and cf>RMSD|3 in Fig. |26ta)(b), 4>RMSDheiix decreases gradually with a decrease 
in /, and ^>RMSDj3 reaches the global minimum, when / is 6.5. Namely, we can see 
that the values of ^RMSDheiix and c?>RMSDj3 are related to the stabilities of heHx 
structure and j5 structure well. 

3.2.4 Use of short MD simulations lH 

We present the results of the applications of our optimization method in Subsection 
2.3.4 to the AMBER ff99SB force field. At first, we chose 31 PDB files (M = 31) 
with resolution 2.0 A or better, with sequence similarity of amino acid 30.0 % or 
lower and with from 40 to 111 residues (the average number of residues is 86.7) 
from PDB-REPRDB iSS). Namely, the PDB IDs of these 31 proteins are ILDD, 
IHBK, 1Y02, 1I2T, 1U84, 2ERL, ITQG, 1082, 1V54, IXAK, IGMU, 105U, 
INLQ, IWHO, ICQY, 1H75, IGMX, IIIB, IVCl, 1AY7, IKAF IKPF 1BM8, 
IMKO, 1EW4, lOSD, IVCC, lOPD, ICYO, ICTF and 1N9L. Generally, data from 
X-ray experiments do not have hydrogen atoms. Therefore, we have to add hydro- 
gen coordinates. Many protein simulation software packages provide with routines 
that add hydrogen atoms to the PDB coordinates. After adding the hydrogen atoms, 
we performed the short potential energy minimizations while restraining the heavy 
atoms. We use the obtained conformations as the initial structures (experimental 
structures). We performed MD simulations for these proteins. Each simulation was 
carried out for 40.0 ps (hence, it consisted of 20,000 MD steps, and the unit time 
step was set to 2.0 fs and the bonds involving hydrogen were constrained by SHAKE 
algorithm |60l ) by using Langevin dynamics at 300 K. The nonbonded cutoff of 20 
A were used. As for solvent effects, we used the GB/SA model li57l included in 
the AMBER program package (igb = 5). These simulations were performed with 
different sets of the same generated initial velocities of atoms in 31 proteins. For 
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all the process, we used the AMBERll program package 11561 . As trial force-field 
parameters, we used the parameters Vi of Xjf (N-Ca-C-N) and Xj/ (Cjg-Ca-C-N) an- 
gles for torsion-energy term in Eq. (|5]l. We performed the simulations by using 14 
and 15 values of the Vi parameters of y/ and yr', respectively, and these simulations 
with each set of parameter values were performed five times by changing the initial 
velocities of atoms in the 31 proteins. Namely, we calculated nf^^ and nf^^ in 
Eq. ( |38] | as the average numbers of nf^^ and nf^^ of 10 trajectories from 20.0 ps 
to 40.0 ps of the five simulations. These results are shown in Fig.|27] We determined 
the optimized force-field parameters in order of y/' and xj/, by searching the mini- 
mum value of S in Fig. [27] Vi parameter for y/ changed from 0.45 to 0.31, and Vi 
parameter for y/' changed from 0.20 to — 1 .60. 
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Fig. 27 5 values (defined in Eq. |38j) obtained from MD simulation.s of 31 proteins with the force 
fields which have different V\ parameter values for y/' (Cp-Ca-C-N) (a) and y/ (N-Ca-C-N) (b) 
angles. 



In order to test the validity of the force-field parameters obtained by our opti- 
mization method, we performed the folding simulations using two peptides, namely, 
C-peptide and G-peptide. 

For test simulations, we used replica-exchange molecular dynamics (REMD) 
ll59ll . We used the AMBERll program package |f56l. The unit time step was set 
to 2.0 fs, and the bonds involving hydrogen were constrained by SHAKE algorithm 
ll60ll . Each simulation was carried out for 30.0 ns (hence, it consisted of 15,000,000 
MD steps) with 32 replicas by using Langevin dynamics. The replica exchange was 
tried every 3,000 steps. The temperature was distributed exponentially; 600, 585, 
571, 557, 544, 530, 517, 505, 492, 480, 469, 457, 446, 435, 425, 414, 404, 394, 
385, 375, 366, 357, 348, 340, 332, 324, 316, 308, 300, 293, 286, and 279 K. As 
for solvent effects, we used the GB/SA model 11571 included in the AMBER pro- 
gram package {igb = 5). These simulations were performed with different sets of 
randomly generated initial velocities. 

In Fig.|28] a helicity and strandness of two peptides obtained from the test sim- 
ulations are shown. We checked the secondary-structure formations by using the 
DSSP program B4l . which is based on the formations of the intra-backbone hy- 
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drogen bonds. For the original AMBER ff99SB force field, the a helicity is clearly 
larger than the strandness in not only C-peptide but also G-peptide. Namely, the 
original AMBER ff99SB force field clearly favors a-helix structure, and does not 
favor j3 structure. On the other hand, for the optimized force field, in the case of 
C-peptide, the a helicity is larger than the strandness, and in the case of G-peptide, 
the strandness is larger than the a helicity. We can see that these results obtained 
from the optimized force field are in better agreement with the experimental results 
in comparison with the original force field. 
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Fig. 28 a helicity (a-l) and strandness (a-2) of C-peptide and a helicity (b-1) and strandness (b- 
2) of G-peptide as functions of the residue number. These values are obtained from REMD 1591 
simulations at 300 K. Normal and dotted lines stand for the optimized and original AMBER ff99SB 
force field, respectively. 



4 Conclusions 

In this Chapter we reviewed our works on force fields for molecular simulations 
of protein systems. We first discussed the functional forms of the force fields and 
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present some extensions of the conventional ones. Because tiie main-ciiain torsion- 
energy terms are the most problematic among the force-field terms in the existing 
force fields, we mainly considered the main-chain torsion-energy terms. We have 
generalized them into the double Fourier series in (j) and \ir. We have also introduced 
the amino-acid dependence on these terms. 

Given the functional forms, we then presented various methods for force-field pa- 
rameter optimizations. Some of our methods use the coordinates from PDB, which 
were determined by experiments. We tried to minimize the effects of systematic 
experimental errors by considering many protein structures. Other methods rely on 
short molecular dynamics simulations with the native conformations from PDB as 
initial ones for the simulations. 

Some examples of our applications of these parameter optimization methods 
were given and they were compared with the results from the existing force-fields. It 
turned out that all the examples resulted in improvement of the existing force fields. 
We thus believe that we are at least on the right track. 

Our optimization methods for the force-field parameters are quite general and 
they can be readily applied to any new energy terms whenever they are introduced 
in the future. 

Acknowledgements The computations were performed on the computers at the Research Cen- 
ter for Computational Science, Institute for Molecular' Science, Information Technology Center, 
Nagoya University, and Center for Computational Sciences, University of Tsukuba. This work was 
supported, in part, by the Grants-in-Aid for the Academic Frontier Project, "Intelligent Information 
Science", for Scientific Research on Innovative Areas ("Fluctuations and Biological Functions" ), 
and for the Next Generation Super Computing Project, Nanoscience Program and Computational 
Materials Science Initiative from the Ministry of Education, Culture, Sports, Science and Technol- 
ogy (MEXT), lapan. 



References 

1. A. Liwo, C. Czaplewski, O. Stanislaw, H.A. Scheraga, Curr. Opin. Struct. Biol. 18, 134 (2008) 

2. H.A. Scheraga, Ann. Rev. Biophys. 40, 1 (2011) 

3. U.H.E. Hansmann, Y. Okamoto, Curr. Opin. Struct. Biol. 9, 177 (1999) 

4. A. Mitsutake, Y. Sugita, Y. Okamoto, Biopolymers 60, 96 (2001) 

5. Y. Okamoto, J. Mol. Graphics Modell. 22, 425 (2004) 

6. A. Mitsutake, Y. Mori, Y. Okamoto, in Biomolecular Simulations: Methods and Protocols, ed. 
by L. Monticelli, E. Salonen (Humana Press, Berlin, 2012), in press. 

7. W.D. Cornell, P Cieplak, C.I. Bayly, I.R. Gould, J. Kenneth M. Merz, D.M. Ferguson, D.C. 
Spellmeyer, T. Fox, J.W. Caldwell, PA. Kollman, J. Am. Chem. Soc. 117, 5179 (1995) 

8. PA. Kollman, R. Dixon, W. Cornell, T. Fox, C. Chipot, A. Pohorille, in Computer Simula- 
tions of Biological Systems, vol. 3, ed. by W.F. van Gunsteren, P.K. Weiner, A.J. Wilkinson 
(Kluwer/ESCOM, Dordrecht, 1997), pp. 83-96 

9. J. Wang, P Cieplak, PA. Kollman, J. Comput. Chem. 21, 1049 (2000) 

10. V. Homak, A. Abel, R. Okur, B. Strockbine, A. Roitberg, C. Simmerling, Proteins 65, 712 
(2006) 

11. Y. Duan, C. Wu, S. Chowdhury, M.C. Lee, G. Xiong, W. Zhang, R. Yang, P Cieplak, R. Luo, 
T. Lee, J. Comput. Chem. 24, 1999 (2003) 



Optimizations of protein force fields 



55 



12. A.D. MacKerell Jr, D. Bashford, M. Bellott, J. Dunbrack, R. L., J.D. Evanseck, M.J. Field, 
S. Fischer, J. Gao, H. Guo, S. Ha, D. Joseph-McCarthy, L. Kuchnir, K. Kuczera, F.T.K. 
Lau, C. Mattos, S. Michnick, T. Ngo, D.T. Nguyen, B. Prodhom, I. Reiher, W. E., B. Roux, 
M. Schlenkrich, J.C. Smith, R. Stote, J. Straub, M. Watanabe, J. Wiorkiewicz-Kuczera, D. Yin, 
M. Karplus, J Phys Chem B 102, 3586 (1998) 

13. A. MacKerell Jr, M. Feig, C. Brooks III, J. Comput. Chem. 25, 1400 (2004) 

14. A. MacKerell Jr, M. Feig, C. Brooks III, J. Am. Chem. Soc. 126, 698 (2004) 

15. W.L. Jorgensen, D.S. Maxwell, J. Tirado-Rives, J. Am. Chem. Soc. 118, 11225 (1996) 

16. G.A. Kaminski, R.A. Friesner, J. Tirado-Rives, W.L. Jorgensen, J. Phys. Chem. B 105, 6474 
(2001) 

17. W.F. Gunsteren, S.R. Billeter, A.A. Eising, PH. Htinenberger, P Kriiger, A.E. Mark, W.R.P 
Scott, I.G. Tironi, (Vdf Hochschulverlag AG an der ETH Ziirich, Zurich, 1996) 

18. C. Oostenbrink, A. Villa, A.E. Mark, W.Fv. Gunsteren, J. Comput. Chem. 25, 1656 (2004) 

19. H.J.C. Berendsen, D. van der Spoel, R. van Drunen, Comput. Phys. Commun. 91, 43 (1995) 

20. E. Lindahl, B. Hess, D. van der Spoel, J. Mol. Model. 7, 306 (2001) 

21. G. Nemethy, K.D. Gibson, K.A. Palmer, C.N. Yoon, G. Paterlini, A. Zagaii, S. Rumsey, H.A. 
Scheraga, J. Phys. Chem. 96, 6472 (1992) 

22. Y.A. Amautova, A. Jagielska, H.A. Scheraga, J. Phys. Chem. B 110, 5025 (2006) 

23. T. Yoda, Y. Sugita, Y. Okamoto, Chem. Phys. Lett. 386, 460 (2004) 

24. T. Yoda, Y. Sugita, Y. Okamoto, Chem. Phys. 307, 269 (2004) 

25. Y. Sakae, Y. Okamoto, Chem. Phys. Lett. 382, 626 (2003) 

26. Y. Sakae, Y. Okamoto, J. Theo. Comput. Chem. 3, 339 (2004) 

27. Y. Sakae, Y. Okamoto, J. Theo. Comput. Chem. 3, 359 (2004) 

28. C. Simmerling, B. Strockbine, A.E. Roitberg, J. Am. Chem. Soc. 124, 11258 (2002) 

29. Y. Duan, C. Wu, S. Chowdhury, M.C. Lee, G. Xiong, W. Zhang, R. Yang, R Cieplak, R. Luo, 
T. Lee, J. Caldwell, J. Wang, P Kollman, J. Comput. Chem. 24, 1999 (2003) 

30. M. Iwaoka, S. Tomoda, J. Comput. Chem. 24, 1192 (2003) 

31. N. Kamiya, Y. Watanabe, S. Ono, J. Higo, Chem. Phys. Lett. 401, 312 (2005) 

32. R.B. Best, G. Hummer, J. Phys. Chem. B 113, 9004 (2009) 

33. J. Mittal, R.B. Best, Biophys. J. 99, L26 (2010) 

34. Y. Sakae, Y. Okamoto, J. Phys. Soc. Jpn. 75 (2006). 054802 (9 pages) 

35. Y. Sakae, Y. Okamoto, Mol. Sim. 36, 138 (2010) 

36. G.N. Ramachandran, V. Sasisekharan, Adv. Protein Chem. 23, 283 (1968) 

37. Y. Sakae, Y. Okamoto, Mol. Sim. 36, 159 (2010) 

38. Y. Sakae, Y. Okamoto, Mol. Sim. 36, 1 148 (2010) 

39. Y. Sakae, Y. Okamoto, e-print: arXiv: 1206.3909 [cond-mat.stat-mech!; submitted for publica- 
tion. 

40. Y. Sakae, Y. Okamoto, Mol. Sim. In press. 

41. W.C. Still, A. Tempczyk, R.C. Hawley, T. Hendrickson, J. Am. Chem. Soc. 112, 6127 (1990) 

42. D. Qiu, PS. Shenkin, FP HoUinger, W.C. Still, J. Phys. Chem. A 101, 3005 (1990) 

43. S. Kirkpatrick, CD. Gelatt Jr, M.P Vecchi, Science 220, 671 (1983) 

44. W. Kabsch, C. Sander, Biopolymers 22, 2577 (1983) 

45. Y. Sakae, Y. Okamoto, In preparation. 

46. S. Honda, N. Kobayashi, E. Munekata, J. Mol. Biol. 295, 269 (2000) 

47. K.R. Shoemaker, PS. Kim, D.N. Brems, S. Marqusee, E.J. York, I.M. Chaiken, J.M. Stewart, 
R.L. Baldwin, Proc. Natl. Acad. Sci. U.S.A. 82, 2349 (1985) 

48. J.J. Osterhout Jr, R.L. Baldwin, E.J. York, J.M. Stewart, H.J. Dyson, PE. Wright, Biochem- 
istry 28, 7059(1989) 

49. FJ. Blanco, G. Rivas, L. Serrano, Nature Struct. Biol. 1, 584 (1994) 

50. N. Kobayashi, S. Honda, H. Yoshii, H. Uedaira, E. Munek ata, FEES Lett. 366, 99 (19 95) 

51. Accelrys discovery studio visualizer Software available at http://www.accelrys.com/ 

52. H.J.C. Berendsen, J. P.M. Postma, W.F. van Gunsteren, A. DiNola, J.R. Haak, J. Chem. Phys. 
81, 3684 (1984) 

53. Tinker program package. Software available at |http://dasher. wustl.edu/tinker/ 

54. URL |http :77www . accelrys . com/ 1 



56 



Yoshitake Sakae and Yuko Okamoto 



55. T. Noguchi, K. Onizuka, Y. Akiyama, M. Saito, in Proc. of the Fifth International Conference 
on Intelligent Systems for Molecular Biology (AAAI press, Menlo Park, CA, 1997) 

56. D.A. Case, T. Cheatham, T. Darden, H. Gohlke, R. Luo, K.M. Merz, Jr., A. Onufriev, C. Sim- 
merUng, B. Wang, R. Woods, J. Computat. Chem. 26, 1668 (2005) 

57. A. Onufriev, D. Bashford, D.A. Case, Proteins 55, 383 (2004) 

58. J. Weiser, PS. Shenkin, W.C. Still, J. Comput. Chem. 20, 217 (1999) 

59. Y. Sugita, Y. Okamoto, Chem. Phys. Lett. 314, 141 (1999) 

60. J.P Ryckaert, G. Ciccotti, H.J.C. Berendsen, J. Comput. Phys. 23, 327 (1977) 

61. G. Wang, R.L.D. Jr, Bioinformatics 19, 1589 (2003) 

62. W.G. Hoover, Phys. Rev. A 31, 1695 (1985) 

63. W.L. Jorgensen, J. Tirado-Rives, J. Am. Chem. Soc. 110, 1657 (1988) 

64. M. Levitt, C. Chothia, Nature 261, 552 (1976) 



